PLANT BIOCHEMISTRY-RELATED GENES

RELATED APPLICATION INFORMATION 

The present invention claims the benefit of US Provisional Patent Application Serial
Nos. 60/166,228 filed November 17, 1999 and 60/197,899 filed April 17, 2000 and "Plant Trait
Modification III" filed August 22, 2000.

FIELD OF THE INVENTION 

This invention relates to the field of plant biology. More particularly, the present
invention pertains to compositions and methods for phenotypically modifying a plant.

BACKGROUND OF THE INVENTION

Transcription factors can modulate gene expression, either increasing or decreasing
(inducing or repressing) the rate of transcription. This modulation results in differential levels of
gene expression at various developmental stages, in different tissues and cell types, and in
response to different exogenous (e.g., environmental) and endogenous stimuli throughout the life
cycle of the organism.

Because transcription factors are key controlling elements of biological pathways,
altering the expression levels of one or more transcription factors can change entire biological
pathways in an organism. For example, manipulation of the levels of selected transcription
factors may result in increased expression of economically useful proteins or metabolic chemicals
in plants or may improve other agriculturally relevant characteristics. Conversely, blocked or
reduced expression of a transcription factor may reduce biosynthesis of unwanted compounds or
remove an undesirable trait. Therefore, manipulating transcription factor levels in a plant offers
tremendous potential in agricultural biotechnology for modifying a plant's traits.

The present invention provides novel transcription factors useful for modifying a plant's
phenotype in desirable ways, such as modifying a plant's biochemical traits.

SUMMARY OF THE INVENTION 

In a first aspect, the invention relates to a recombinant polynucleotide comprising a
nucleotide sequence selected from the group consisting of: (a) a nucleotide sequence encoding a
polypeptide comprising a sequence selected from SEQ ID Nos. 2N, where N=1-22, or a
complementary nucleotide sequence thereof; (b) a nucleotide sequence encoding a polypeptide
comprising a conservatively substituted variant of a polypeptide of (a); (c) a nucleotide sequence
comprising a sequence selected from those of SEQ ID Nos. 2N-1, where N=1-22, or a
complementary nucleotide sequence thereof; (d) a nucleotide sequence comprising silent
substitutions in a nucleotide sequence of (c); (e) a nucleotide sequence which hybridizes under
stringent conditions over substantially the entire length of a nucleotide sequence of one or more
of: (a), (b), (c), or (d); (f) a nucleotide sequence comprising at least 15 consecutive nucleotides of
a sequence of any of (a)-(e); (g) a nucleotide sequence comprising a subsequence or fragment of
any of (a)-(f), which subsequence or fragment encodes a polypeptide having a biological activity
that modifies a plant's biochemical characteristic; (h) a nucleotide sequence having at least 31%
sequence identity to a nucleotide sequence of any of (a)-(g); (i) a nucleotide sequence having at
least 60% sequence identity to a nucleotide sequence of any of (a)-(g); (j) a nucleotide
sequence which encodes a polypeptide having at least 31% sequence identity to a
polypeptide of SEQ ID Nos. 2N, where N=1-22; (k) a nucleotide sequence which encodes a
polypeptide having at least 60% sequence identity to a polypeptide of SEQ ID Nos. 2N,
where N=1-22; and (l) a nucleotide sequence which encodes a conserved domain of a polypeptide
having at least 65% sequence identity to a conserved domain of a polypeptide of SEQ ID Nos.
2N, where N=1-22. The recombinant polynucleotide may further comprise a constitutive,
inducible, or tissue-active promoter operably linked to the nucleotide sequence. The invention
also relates to compositions comprising at least two of the above described polynucleotides.
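
The numbering convention used above can be sketched as follows. This is an illustration only: it assumes, as the wording of (a) and (c) suggests but does not state outright, that the polynucleotide of SEQ ID No. 2N-1 is paired with the polypeptide of SEQ ID No. 2N. The function name `seq_id_pairs` is hypothetical.

```python
# Illustrative sketch of the SEQ ID numbering convention (an assumption, see lead-in):
# polynucleotides carry odd numbers (2N-1), polypeptides carry even numbers (2N).

def seq_id_pairs(n_max=22):
    """Return (polynucleotide SEQ ID No., polypeptide SEQ ID No.) for N = 1..n_max."""
    return [(2 * n - 1, 2 * n) for n in range(1, n_max + 1)]


pairs = seq_id_pairs()
for nucleotide_id, polypeptide_id in pairs[:3]:
    print(f"SEQ ID No. {nucleotide_id} (polynucleotide) / "
          f"SEQ ID No. {polypeptide_id} (polypeptide)")
```

With N running from 1 to 22, the pairs span SEQ ID Nos. 1 through 44.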

In a second aspect, the invention is an isolated or recombinant polypeptide comprising a 
subsequence of at least about 15 contiguous amino acids encoded by the recombinant or isolated 
polynucleotide described above. 

In another aspect, the invention is a transgenic plant comprising one or more of the above
described recombinant polynucleotides. In yet another aspect, the invention is a plant with
altered expression levels of a polynucleotide described above or a plant with altered expression or
activity levels of an above described polypeptide. Further, the invention is a plant lacking a
nucleotide sequence encoding a polypeptide described above. The plant may be a soybean,
wheat, corn, potato, cotton, rice, oilseed rape, sunflower, alfalfa, sugarcane, turf, banana,
blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, cucumber,
eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, pineapple,
spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, or vegetable brassicas
plant.

In a further aspect, the invention relates to a cloning or expression vector comprising the
isolated or recombinant polynucleotide described above or cells comprising the cloning or
expression vector.
In yet a further aspect, the invention relates to a composition produced by incubating a
polynucleotide of the invention with a nuclease, a restriction enzyme, a polymerase, a
polymerase and a primer, a cloning vector, or a cell.

Furthermore, the invention relates to a method for producing a plant having a modified
biochemical trait. The method comprises altering the expression of an isolated or recombinant
polynucleotide of the invention or altering the expression or activity of a polypeptide of the
invention in a plant to produce a modified plant, and selecting the modified plant for a modified
biochemical trait.

In another aspect, the invention relates to a method of identifying a factor that is
modulated by or interacts with a polypeptide encoded by a polynucleotide of the invention. The
method comprises expressing a polypeptide encoded by the polynucleotide in a plant; and
identifying at least one factor that is modulated by or interacts with the polypeptide. In one
embodiment, the modulating or interacting factors are identified by detecting binding
by the polypeptide to a promoter sequence, by detecting interactions between an additional
protein and the polypeptide in a yeast two-hybrid system, or by detecting expression of a factor by
hybridization to a microarray, subtractive hybridization or differential display.

In yet another aspect, the invention is a method of identifying a molecule that modulates
activity or expression of a polynucleotide or polypeptide of interest. The method comprises
placing the molecule in contact with a plant comprising the polynucleotide or polypeptide
encoded by the polynucleotide of the invention and monitoring one or more of the expression
level of the polynucleotide in the plant, the expression level of the polypeptide in the plant, and
modulation of an activity of the polypeptide in the plant.

In yet another aspect, the invention relates to an integrated system, computer or computer
readable medium comprising one or more character strings corresponding to a polynucleotide of
the invention, or to a polypeptide encoded by the polynucleotide. The integrated system,
computer or computer readable medium may comprise a link between one or more sequence
strings and a modified plant biochemical trait.

In yet another aspect, the invention is a method for identifying a sequence similar or
homologous to one or more polynucleotides of the invention, or one or more polypeptides
encoded by the polynucleotides. The method comprises providing a sequence database; and,
querying the sequence database with one or more target sequences corresponding to the one or
more polynucleotides or to the one or more polypeptides to identify one or more sequence
members of the database that display sequence similarity or homology to one or more of the one
or more target sequences. The method may further comprise linking the one or more of the
polynucleotides of the invention, or encoded polypeptides, to a modified plant biochemical phenotype.
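
A minimal sketch of such a database query is given below. It is not the search method of the invention: it assumes a toy in-memory database and ranks entries by the fraction of shared 4-letter substrings (k-mers), a crude stand-in for an alignment-based search. The names `kmers`, `query_database`, the entry labels and the `min_score` threshold are all hypothetical.

```python
# Toy similarity search: rank database entries by shared k-mer content.
# All names, sequences and thresholds here are illustrative assumptions.

def kmers(seq, k=4):
    """Return the set of all length-k substrings of a sequence."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def query_database(database, target, k=4, min_score=0.3):
    """Return (name, score) hits sorted by decreasing Jaccard k-mer similarity."""
    target_kmers = kmers(target, k)
    hits = []
    for name, seq in database.items():
        db_kmers = kmers(seq, k)
        union = target_kmers | db_kmers
        score = len(target_kmers & db_kmers) / len(union) if union else 0.0
        if score >= min_score:
            hits.append((name, score))
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


database = {
    "entry_A": "ATGGCGTACGTTAGC",  # identical to the target
    "entry_B": "ATGGCGTACGTAAGC",  # differs from the target at one position
    "entry_C": "TTTTTTTTTTTTTTT",  # unrelated
}
hits = query_database(database, "ATGGCGTACGTTAGC")
for name, score in hits:
    print(name, round(score, 2))
```

Each hit could then be linked to a phenotype record, for example through a dictionary keyed by entry name.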

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 provides a table of exemplary polynucleotide and polypeptide sequences of the
invention. The table includes from left to right for each sequence: the SEQ ID No., the internal
code reference number (GID), whether the sequence is a polynucleotide or polypeptide sequence,
and identification of any conserved domains for the polypeptide sequences.

Figure 2 provides a table of exemplary sequences that are homologous to other sequences
provided in the Sequence Listing and that are derived from Arabidopsis thaliana. The table
includes from left to right: the SEQ ID No., the internal code reference number (GID),
identification of the homologous sequence, whether the sequence is a polynucleotide or
polypeptide sequence, and identification of any conserved domains for the polypeptide
sequences.

Figure 3 provides a table of exemplary sequences that are homologous to the sequences
provided in Figures 1 and 2 and that are derived from plants other than Arabidopsis thaliana. The
table includes from left to right: the SEQ ID No., the internal code reference number (GID), the
unique GenBank sequence ID No. (NID), the probability that the comparison was generated by
chance (P-value), and the species from which the homologous gene was identified.



DETAILED DESCRIPTION

The present invention relates to polynucleotides and polypeptides, e.g., for modifying
phenotypes of plants.

In particular, the polynucleotides or polypeptides are useful for modifying traits
associated with a plant's biochemical characteristic when the expression levels of the
polynucleotides or expression levels or activity levels of the polypeptides are altered.

The polynucleotides of the invention encode plant transcription factors. The plant
transcription factors are derived, e.g., from Arabidopsis thaliana and can belong, e.g., to one or
more of the following transcription factor families: the AP2 (APETALA2) domain transcription
factor family (Riechmann and Meyerowitz (1998) J. Biol. Chem. 379:633-646); the MYB
transcription factor family (Martin and Paz-Ares (1997) Trends Genet. 13:67-73); the MADS
domain transcription factor family (Riechmann and Meyerowitz (1997) J. Biol. Chem. 378:1079-1101);
the WRKY protein family (Ishiguro and Nakamura (1994) Mol. Gen. Genet. 244:563-571);
the ankyrin-repeat protein family (Zhang et al. (1992) Plant Cell 4:1575-1588); the
miscellaneous protein (MISC) family (Kim et al. (1997) Plant J. 11:1237-1251); the zinc finger
protein (Z) family (Klug and Schwabe (1995) FASEB J. 9:597-604); the homeobox (HB) protein
family (Duboule (1994) Guidebook to the Homeobox Genes, Oxford University Press); the
CAAT-element binding proteins (Forsburg and Guarente (1989) Genes Dev. 3:1166-1178); the
squamosa promoter binding proteins (SPB) (Klein et al. (1996) Mol. Gen. Genet. 250:7-16);
the NAM protein family; the IAA/AUX proteins (Rouse et al. (1998) Science 279:1371-1373);
the HLH/MYC protein family (Littlewood et al. (1994) Prot. Profile 1:639-709); the
DNA-binding protein (DBP) family (Tucker et al. (1994) EMBO J. 13:2994-3002); the bZIP family of
transcription factors (Foster et al. (1994) FASEB J. 8:192-200); the BPF-1 protein (Box
P-binding factor) family (da Costa e Silva et al. (1993) Plant J. 4:125-135); and the golden protein
(GLD) family (Hall et al. (1998) Plant Cell 10:925-936).

In addition to methods for modifying a plant phenotype by employing one or more
polynucleotides and polypeptides of the invention described herein, the polynucleotides and
polypeptides of the invention have a variety of additional uses. These uses include their use in
the recombinant production (i.e., expression) of proteins; as regulators of plant gene expression; as
diagnostic probes for the presence of complementary or partially complementary nucleic acids
(including for detection of natural coding nucleic acids); as substrates for further reactions, e.g.,
mutation reactions, PCR reactions, or the like, or as substrates for cloning, e.g., including
digestion or ligation reactions; and for identifying exogenous or endogenous modulators of the
transcription factors.

DEFINITIONS 

A "polynucleotide" is a nucleic acid sequence comprising a plurality of polymerized
nucleotide residues, e.g., at least about 15 consecutive polymerized nucleotide residues,
optionally at least about 30 consecutive nucleotides, at least about 50 consecutive nucleotides. In
many instances, a polynucleotide comprises a nucleotide sequence encoding a polypeptide (or
protein) or a domain or fragment thereof. Additionally, the polynucleotide may comprise a
promoter, an intron, an enhancer region, a polyadenylation site, a translation initiation site, 5' or
3' untranslated regions, a reporter gene, a selectable marker, or the like. The polynucleotide can
be single stranded or double stranded DNA or RNA. The polynucleotide optionally comprises
modified bases or a modified backbone. The polynucleotide can be, e.g., genomic DNA or RNA,
a transcript (such as an mRNA), a cDNA, a PCR product, a cloned DNA, a synthetic DNA or
RNA, or the like. The polynucleotide can comprise a sequence in either sense or antisense
orientations.
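
As an illustration of the sense and antisense orientations, the sketch below derives the reverse complement of a DNA string. It handles only the four unmodified DNA bases; the function name `reverse_complement` and the example sequence are hypothetical.

```python
# Illustrative sketch: derive the antisense (reverse-complement) strand of a DNA sequence.
# Only unmodified A, C, G and T are handled; other characters pass through unchanged.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Complement each base, then reverse so the result reads 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


sense = "ATGGCCAAG"  # hypothetical sense-strand fragment
antisense = reverse_complement(sense)
print(antisense)  # CTTGGCCAT
```

Applying the function twice returns the original sense strand.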
A "recombinant polynucleotide" is a polynucleotide that is not in its native state, e.g., the
polynucleotide comprises a nucleotide sequence not found in nature, or the polynucleotide is in a
context other than that in which it is naturally found, e.g., separated from nucleotide sequences
with which it typically is in proximity in nature, or adjacent to (or contiguous with) nucleotide
sequences with which it typically is not in proximity. For example, the sequence at issue can be
cloned into a vector, or otherwise recombined with one or more additional nucleic acids.

An "isolated polynucleotide" is a polynucleotide, whether naturally occurring or
recombinant, that is present outside the cell in which it is typically found in nature, whether
purified or not. Optionally, an isolated polynucleotide is subject to one or more enrichment or
purification procedures, e.g., cell lysis, extraction, centrifugation, precipitation, or the like.

A "recombinant polypeptide" is a polypeptide produced by translation of a recombinant
polynucleotide. An "isolated polypeptide," whether a naturally occurring or a recombinant
polypeptide, is more enriched in (or out of) a cell than the polypeptide in its natural state in a wild
type cell, e.g., more than about 5% enriched, more than about 10% enriched, or more than about
20%, or more than about 50%, or more, enriched, i.e., alternatively denoted: 105%, 110%, 120%,
150% or more, enriched relative to wild type standardized at 100%. Such an enrichment is not
the result of a natural response of a wild type plant. Alternatively, or additionally, the isolated
polypeptide is separated from other cellular components with which it is typically associated, e.g.,
by any of the various protein purification methods herein.

The term "transgenic plant" refers to a plant that contains genetic material not found in a
wild type plant of the same species, variety or cultivar. The genetic material may include a
transgene, an insertional mutagenesis event (such as by transposon or T-DNA insertional
mutagenesis), an activation tagging sequence, a mutated sequence, a homologous recombination
event or a sequence modified by chimeraplasty. Typically, the foreign genetic material has been
introduced into the plant by human manipulation.

A transgenic plant may contain an expression vector or cassette. The expression cassette
typically comprises a polypeptide-encoding sequence operably linked to (i.e., under regulatory
control of) appropriate inducible or constitutive regulatory sequences that allow for the
expression of the polypeptide. The expression cassette can be introduced into a plant by
transformation or by breeding after transformation of a parent plant. A plant refers to a whole
plant as well as to a plant part, such as seed, fruit, leaf, or root, plant tissue, plant cells or any
other plant material, e.g., a plant explant, as well as to progeny thereof, and to in vitro systems
that mimic biochemical or cellular components or processes in a cell.
The phrase "ectopic expression or altered expression" in reference to a polynucleotide
indicates that the pattern of expression in, e.g., a transgenic plant or plant tissue, is different from
the expression pattern in a wild type plant or a reference plant of the same species. For example,
the polynucleotide or polypeptide is expressed in a cell or tissue type other than a cell or tissue
type in which the sequence is expressed in the wild type plant, or at a time other
than the time the sequence is expressed in the wild type plant, or in response to different
inducible agents, such as hormones or environmental signals, or at different expression levels
(either higher or lower) compared with those found in a wild type plant. The term also refers to
altered expression patterns that are produced by lowering the levels of expression to below the
detection level or completely abolishing expression. The resulting expression pattern can be
transient or stable, constitutive or inducible. In reference to a polypeptide, the term "ectopic
expression or altered expression" further may relate to altered activity levels resulting from the
interactions of the polypeptides with exogenous or endogenous modulators or from interactions
with factors or as a result of the chemical modification of the polypeptides.

The term "fragment" or "domain," with respect to a polypeptide, refers to a subsequence
of the polypeptide. In some cases, the fragment or domain is a subsequence of the polypeptide
which performs at least one biological function of the intact polypeptide in substantially the same
manner, or to a similar extent, as does the intact polypeptide. For example, a polypeptide
fragment can comprise a recognizable structural motif or functional domain such as a DNA
binding domain that binds to a DNA promoter region, an activation domain or a domain for
protein-protein interactions. Fragments can vary in size from as few as 6 amino acids to the full
length of the intact polypeptide, but are preferably at least about 30 amino acids in length and
more preferably at least about 60 amino acids in length. In reference to a nucleotide sequence, "a
fragment" refers to any subsequence of a polynucleotide, typically of at least about
15 consecutive nucleotides, preferably at least about 30 nucleotides, more preferably at least about 50, of any
of the sequences provided herein.
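
The length criterion for a nucleotide fragment can be illustrated with the sketch below, which tests whether a candidate sequence contains a run of at least 15 consecutive nucleotides found in a reference sequence. It compares one strand only and treats "about 15" as exactly 15; the function name `shares_fragment` and the example sequences are hypothetical.

```python
# Illustrative sketch: does a candidate contain at least min_len consecutive
# nucleotides of a reference? Any shared run of min_len or more necessarily
# contains a shared window of exactly min_len, so fixed-size windows suffice.

def shares_fragment(candidate, reference, min_len=15):
    candidate, reference = candidate.upper(), reference.upper()
    windows = {reference[i:i + min_len]
               for i in range(len(reference) - min_len + 1)}
    return any(candidate[i:i + min_len] in windows
               for i in range(len(candidate) - min_len + 1))


reference = "ATGGCGTACGTTAGCCGATAGGCTA"      # hypothetical 25-nt reference
candidate = "CCC" + reference[5:21] + "GGG"  # embeds 16 consecutive reference bases
print(shares_fragment(candidate, reference))       # True
print(shares_fragment(reference[:14], reference))  # False: only 14 nucleotides
```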

The term "trait" refers to a physiological, morphological, biochemical or physical
characteristic of a plant or particular plant material or cell. In some instances, this characteristic
is visible to the human eye, such as seed or plant size, or can be measured by available
biochemical techniques, such as the protein, starch or oil content of seed or leaves or by the
observation of the expression level of genes, e.g., by employing Northern analysis, RT-PCR,
microarray gene expression assays or reporter gene expression systems, or by agricultural
observations such as stress tolerance, yield or pathogen tolerance.
"Trait modification" refers to a detectable difference in a characteristic in a plant
ectopically expressing a polynucleotide or polypeptide of the present invention relative to a plant
not doing so, such as a wild type plant. In some cases, the trait modification can be evaluated
quantitatively. For example, the trait modification can entail at least about a 2% increase or
decrease in an observed trait (difference), at least a 5% difference, at least about a 10%
difference, at least about a 20% difference, at least about a 30%, at least about a 50%, at least
about a 70%, or at least about a 100%, or an even greater difference. It is known that there can be
a natural variation in the modified trait. Therefore, the trait modification observed entails a
change of the normal distribution of the trait in the plants compared with the distribution
observed in wild type plants.
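
The quantitative comparison described above can be sketched as a percent difference between group means. The measurements below are invented purely for illustration, and comparing means is only one simple way to summarize a shift in a trait distribution; the function name `percent_difference` is hypothetical.

```python
# Illustrative sketch: percent difference in a trait between modified and wild type plants.
# All measurement values are hypothetical.
from statistics import mean


def percent_difference(modified, wild_type):
    """Percent change of the modified-group mean relative to the wild type mean."""
    wild_type_mean = mean(wild_type)
    return 100.0 * (mean(modified) - wild_type_mean) / wild_type_mean


wild_type_oil = [20.0, 21.0, 19.0, 20.0]  # hypothetical seed oil content, % dry weight
modified_oil = [22.0, 23.0, 21.0, 22.0]   # hypothetical values for modified plants
change = percent_difference(modified_oil, wild_type_oil)
print(f"{change:.1f}% difference")  # 10.0% difference
```

Because individual plants vary naturally, a real evaluation would also compare the spread of the two groups, not only their means.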

Trait modifications of particular interest include those to seed (such as embryo or
endosperm), fruit, root, flower, leaf, stem, shoot, seedling or the like, including: enhanced
tolerance to environmental conditions including freezing, chilling, heat, drought, water saturation,
radiation and ozone; improved tolerance to microbial, fungal or viral diseases; improved
tolerance to pest infestations, including nematodes, mollicutes, parasitic higher plants or the like;
decreased herbicide sensitivity; improved tolerance of heavy metals or enhanced ability to take up
heavy metals; improved growth under poor photoconditions (e.g., low light and/or short day
length), or changes in expression levels of genes of interest. Other phenotypes that can be
modified relate to the production of plant metabolites, such as variations in the production of
taxol, tocopherol, tocotrienol, sterols, phytosterols, vitamins, wax monomers, anti-oxidants,
amino acids, lignins, cellulose, tannins, prenyllipids (such as chlorophylls and carotenoids),
glucosinolates, and terpenoids, enhanced or compositionally altered protein or oil production
(especially in seeds), or modified sugar (insoluble or soluble) and/or starch composition.
Physical plant characteristics that can be modified include cell development (such as the number
of trichomes), fruit and seed size and number, yields of plant parts such as stems, leaves and
roots, the stability of the seeds during storage, characteristics of the seed pod (e.g., susceptibility
to shattering), root hair length and quantity, internode distances, or the quality of seed coat. Plant
growth characteristics that can be modified include growth rate, germination rate of seeds, vigor
of plants and seedlings, leaf and flower senescence, male sterility, apomixis, flowering time,
flower abscission, rate of nitrogen uptake, biomass or transpiration characteristics, as well as
plant architecture characteristics such as apical dominance, branching patterns, number of organs,
organ identity, organ shape or size.
POLYPEPTIDES AND POLYNUCLEOTIDES OF THE INVENTION

The present invention provides, among other things, transcription factors (TFs), and
transcription factor homologue polypeptides, and isolated or recombinant polynucleotides
encoding the polypeptides. These polypeptides and polynucleotides may be employed to modify
a plant's biochemical characteristic.

Exemplary polynucleotides encoding the polypeptides of the invention were identified in
the Arabidopsis thaliana GenBank database using publicly available sequence analysis programs
and parameters. Sequences initially identified were then further characterized to identify
sequences comprising specified sequence strings corresponding to sequence motifs present in
families of known transcription factors. Polynucleotide sequences meeting such criteria were
confirmed as transcription factors.

Additional polynucleotides of the invention were identified by screening Arabidopsis
thaliana and/or other plant cDNA libraries with probes corresponding to known transcription
factors under low stringency hybridization conditions. Additional sequences, including full
length coding sequences, were subsequently recovered by the rapid amplification of cDNA ends
(RACE) procedure, using a commercially available kit according to the manufacturer's
instructions. Where necessary, multiple rounds of RACE were performed to isolate 5' and 3' ends.
The full length cDNA was then recovered by a routine end-to-end polymerase chain reaction
(PCR) using primers specific to the isolated 5' and 3' ends. Exemplary sequences are provided in
the Sequence Listing.

The polynucleotides of the invention were ectopically expressed in overexpressor or
knockout plants and changes in the biochemical characteristics of the plants were observed.
Therefore, the polynucleotides and polypeptides can be employed to improve the biochemical
characteristics of plants.

Making polynucleotides

The polynucleotides of the invention include sequences that encode transcription factors
and transcription factor homologue polypeptides and sequences complementary thereto, as well
as unique fragments of coding sequence, or sequence complementary thereto. Such
polynucleotides can be, e.g., DNA or RNA, e.g., mRNA, cRNA, synthetic RNA, genomic DNA,
cDNA, synthetic DNA, oligonucleotides, etc. The polynucleotides are either double-stranded or
single-stranded, and include either or both sense (i.e., coding) sequences and antisense (i.e.,
non-coding, complementary) sequences. The polynucleotides include the coding sequence of a
transcription factor, or transcription factor homologue polypeptide, in isolation, in combination
with additional coding sequences (e.g., a purification tag, a localization signal, as a
fusion-protein, as a pre-protein, or the like), in combination with non-coding sequences (e.g., introns or
inteins, regulatory elements such as promoters, enhancers, terminators, and the like), and/or in a
vector or host environment in which the polynucleotide encoding a transcription factor or
transcription factor homologue polypeptide is an endogenous or exogenous gene.
A variety of methods exist for producing the polynucleotides of the invention. Procedures
for identifying and isolating DNA clones are well known to those of skill in the art, and are
described in, e.g., Berger and Kimmel, Guide to Molecular Cloning Techniques, Methods in
Enzymology volume 152 Academic Press, Inc., San Diego, CA ("Berger"); Sambrook et al.,
Molecular Cloning - A Laboratory Manual (2nd Ed.), Vol. 1-3, Cold Spring Harbor Laboratory,
Cold Spring Harbor, New York, 1989 ("Sambrook") and Current Protocols in Molecular Biology,
F.M. Ausubel et al., eds., Current Protocols, a joint venture between Greene Publishing
Associates, Inc. and John Wiley & Sons, Inc., (supplemented through 2000) ("Ausubel").

Alternatively, polynucleotides of the invention can be produced by a variety of in vitro
amplification methods adapted to the present invention by appropriate selection of specific or
degenerate primers. Examples of protocols sufficient to direct persons of skill through in vitro
amplification methods, including the polymerase chain reaction (PCR), the ligase chain reaction
(LCR), Qbeta-replicase amplification and other RNA polymerase mediated techniques (e.g.,
NASBA), e.g., for the production of the homologous nucleic acids of the invention, are found in
Berger, Sambrook, and Ausubel, as well as Mullis et al., (1987) PCR Protocols A Guide to
Methods and Applications (Innis et al. eds) Academic Press Inc. San Diego, CA (1990) (Innis).
Improved methods for cloning in vitro amplified nucleic acids are described in Wallace et al.,
U.S. Pat. No. 5,426,039. Improved methods for amplifying large nucleic acids by PCR are
summarized in Cheng et al. (1994) Nature 369:684-685 and the references cited therein, in which
PCR amplicons of up to 40 kb are generated. One of skill will appreciate that essentially any
RNA can be converted into a double stranded DNA suitable for restriction digestion, PCR
expansion and sequencing using reverse transcriptase and a polymerase. See, e.g., Ausubel,
Sambrook and Berger, all supra.

Alternatively, polynucleotides and oligonucleotides of the invention can be assembled
from fragments produced by solid-phase synthesis methods. Typically, fragments of up to
approximately 100 bases are individually synthesized and then enzymatically or chemically
ligated to produce a desired sequence, e.g., a polynucleotide encoding all or part of a
transcription factor. For example, chemical synthesis using the phosphoramidite method is
described, e.g., by Beaucage et al. (1981) Tetrahedron Letters 22:1859-69; and Matthes et al.
(1984) EMBO J. 3:801-5. According to such methods, oligonucleotides are synthesized, purified,
annealed to their complementary strand, ligated and then optionally cloned into suitable vectors.
If so desired, the polynucleotides and polypeptides of the invention can be custom ordered
from any of a number of commercial suppliers.

HOMOLOGOUS SEQUENCES 
Sequences homologous, i.e., that share significant sequence identity or similarity, to those
provided in the Sequence Listing, derived from Arabidopsis thaliana or from other plants of
choice, are also an aspect of the invention. Homologous sequences can be derived from any plant
including monocots and dicots and in particular agriculturally important plant species, including
but not limited to, crops such as soybean, wheat, corn, potato, cotton, rice, oilseed rape (including
canola), sunflower, alfalfa, sugarcane and turf; or fruits and vegetables, such as banana,
blackberry, blueberry, strawberry, and raspberry, cantaloupe, carrot, cauliflower, coffee,
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers,
pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits (such as
apple, peach, pear, cherry and plum) and vegetable brassicas (such as broccoli, cabbage,
cauliflower, Brussels sprouts and kohlrabi). Other crops, fruits and vegetables whose phenotype
can be changed include barley, rye, millet, sorghum, currant, avocado, citrus fruits such as
oranges, lemons, grapefruit and tangerines, artichoke, cherries, nuts such as the walnut and
peanut, endive, leek, roots, such as arrowroot, beet, cassava, turnip, radish, yam, and sweet
potato, and beans. The homologous sequences may also be derived from woody species, such
as pine, poplar and eucalyptus.

Transcription factors that are homologous to the listed sequences will typically share at
least about 30% amino acid sequence identity. More closely related transcription factors can
share at least about 50%, about 60%, about 65%, about 70%, about 75% or about 80% or about
90% or about 95% or about 98% or more sequence identity with the listed sequences. Factors
that are most closely related to the listed sequences share, e.g., at least about 85%, about 90% or
about 95% or more sequence identity to the listed sequences. At the nucleotide level, the
sequences will typically share at least about 40% nucleotide sequence identity, preferably at least
about 50%, about 60%, about 70% or about 80% sequence identity, and more preferably about
85%, about 90%, about 95% or about 97% or more sequence identity to one or more of the listed
sequences. The degeneracy of the genetic code enables major variations in the nucleotide
sequence of a polynucleotide while maintaining the amino acid sequence of the encoded protein.
Conserved domains within a transcription factor family may exhibit a higher degree of sequence
homology, such as at least 65% sequence identity including conservative substitutions, and
preferably at least 80% sequence identity.
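
As a rough illustration of how the identity thresholds above could be applied, the following
Python sketch computes percent identity over two sequences that have already been aligned. The
function names, the gap character and the choice to count only columns in which neither sequence
has a gap are assumptions made for illustration; they are not taken from this specification, and
in practice an alignment program would normally produce both the alignment and the score.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             minimum_percent: float = 30.0) -> bool:
    """True if the pair reaches the chosen identity cutoff (30% by default)."""
    return percent_identity(aligned_a, aligned_b) >= minimum_percent
```

The default cutoff of 30% mirrors the lowest amino acid threshold mentioned above; a caller
interested only in closely related factors could pass 85.0 or higher instead.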

Identifying Nucleic Acids by Hybridization
Polynucleotides homologous to the sequences illustrated in the Sequence Listing can be
identified, e.g., by hybridization to each other under stringent or under highly stringent
conditions. Single stranded polynucleotides hybridize when they associate based on a variety of
well characterized physico-chemical forces, such as hydrogen bonding, solvent exclusion, base
stacking and the like. The stringency of a hybridization reflects the degree of sequence identity
of the nucleic acids involved, such that the higher the stringency, the more similar are the two
polynucleotide strands. Stringency is influenced by a variety of factors, including temperature,
salt concentration and composition, organic and non-organic additives, solvents, etc. present in
both the hybridization and wash solutions and incubations (and their number), as described in
more detail in the references cited above.

An example of stringent hybridization conditions for hybridization of complementary
nucleic acids which have more than 100 complementary residues on a filter in a Southern or
northern blot is about 5°C to 20°C lower than the thermal melting point (Tm) for the specific
sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic
strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe.
Nucleic acid molecules that hybridize under stringent conditions will typically hybridize to a
probe based on either the entire cDNA or selected portions, e.g., to a unique subsequence, of the
cDNA under wash conditions of 0.2x SSC to 2.0x SSC, 0.1% SDS at 50-65°C, for example 0.2x
SSC, 0.1% SDS at 65°C. For identification of less closely related homologues, washes can be
performed at a lower temperature, e.g., 50°C. In general, stringency is increased by raising the
wash temperature and/or decreasing the concentration of SSC.
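
The relationship between Tm and a stringent temperature can be sketched numerically. The Tm
estimate below uses a commonly cited empirical approximation for long DNA:DNA hybrids,
Tm = 81.5 + 16.6 log10[Na+] + 0.41(%GC) - 0.61(% formamide) - 500/L. This formula, the function
names and the example values are assumptions introduced for illustration and do not come from
this specification; only the window of 5°C to 20°C below Tm is taken from the text above.

```python
import math


def estimate_tm(gc_percent: float, length: int, sodium_molar: float,
                formamide_percent: float = 0.0) -> float:
    """Approximate Tm in degrees C for a long DNA:DNA hybrid (assumed empirical formula)."""
    if length <= 0 or sodium_molar <= 0:
        raise ValueError("length and sodium concentration must be positive")
    return (81.5
            + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)


def stringent_window(tm: float) -> tuple:
    """Return (lowest, highest) temperature lying 20 to 5 degrees C below Tm."""
    return (tm - 20.0, tm - 5.0)
```

Because the salt term enters through a logarithm, lowering the sodium concentration lowers the
estimated Tm, which is consistent with the statement that decreasing the SSC concentration
increases stringency at a fixed wash temperature.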

As another example, stringent conditions can be selected such that an oligonucleotide that
is perfectly complementary to the coding oligonucleotide hybridizes to the coding oligonucleotide
with at least about a 5-10x higher signal to noise ratio than the ratio for hybridization of the
perfectly complementary oligonucleotide to a nucleic acid encoding a transcription factor known
as of the filing date of the application. Conditions can be selected such that a higher signal to
noise ratio is observed in the particular assay which is used, e.g., about 15x, 25x, 35x, 50x or
more. Accordingly, the subject nucleic acid hybridizes to the unique coding oligonucleotide with
at least a 2x higher signal to noise ratio as compared to hybridization of the coding
oligonucleotide to a nucleic acid encoding a known polypeptide. Again, higher signal to noise
ratios can be selected, e.g., about 5x, 10x, 25x, 35x, 50x or more. The particular signal will
depend on the label used in the relevant assay, e.g., a fluorescent label, a colorimetric label, a
radioactive label, or the like.
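
The signal to noise comparison described above amounts to a ratio of two ratios. The sketch
below is one way to express it, assuming that signal and background are measured in the same
arbitrary units for both the test hybridization and the reference hybridization; the function
names and the default 5x cutoff are illustrative choices, not part of this specification.

```python
def signal_to_noise(signal: float, background: float) -> float:
    """Signal to noise ratio for a single hybridization."""
    if background <= 0:
        raise ValueError("background must be positive")
    return signal / background


def fold_over_reference(test_snr: float, reference_snr: float) -> float:
    """How many times higher the test ratio is than the reference ratio."""
    if reference_snr <= 0:
        raise ValueError("reference ratio must be positive")
    return test_snr / reference_snr


def passes_selection(test_snr: float, reference_snr: float,
                     minimum_fold: float = 5.0) -> bool:
    """True if the test ratio exceeds the reference by the chosen fold (5x assumed)."""
    return fold_over_reference(test_snr, reference_snr) >= minimum_fold
```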

Alternatively, transcription factor homologue polypeptides can be obtained by screening
an expression library using antibodies specific for one or more transcription factors. With the
provision herein of the disclosed transcription factor, and transcription factor homologue nucleic
acid sequences, the encoded polypeptide(s) can be expressed and purified in a heterologous
expression system (e.g., E. coli) and used to raise antibodies (monoclonal or polyclonal) specific
for the polypeptide(s) in question. Antibodies can also be raised against synthetic peptides
derived from transcription factor, or transcription factor homologue, amino acid sequences.
Methods of raising antibodies are well known in the art and are described in Harlow and Lane
(1988) Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory, New York. Such
antibodies can then be used to screen an expression library produced from the plant from which it
is desired to clone additional transcription factor homologues, using the methods described above.
The selected cDNAs can be confirmed by sequencing and enzymatic activity.

SEQUENCE VARIATIONS

It will readily be appreciated by those of skill in the art, that any of a variety of
polynucleotide sequences are capable of encoding the transcription factors and transcription
factor homologue polypeptides of the invention. Due to the degeneracy of the genetic code,
many different polynucleotides can encode identical and/or substantially similar polypeptides in
addition to those sequences illustrated in the Sequence Listing.

For example, Table 1 illustrates that the codons AGC, AGT, TCA, TCC, TCG, and
TCT all encode the same amino acid: serine. Accordingly, at each position in the sequence where
there is a codon encoding serine, any of the above trinucleotide sequences can be used without
altering the encoded polypeptide.

Table 1

Amino acids               Codon
Alanine        Ala  A     GCA GCC GCG GCT
Cysteine       Cys  C     TGC TGT
Aspartic acid  Asp  D     GAC GAT
Glutamic acid  Glu  E     GAA GAG
Phenylalanine  Phe  F     TTC TTT
Glycine        Gly  G     GGA GGC GGG GGT
Histidine      His  H     CAC CAT
Isoleucine     Ile  I     ATA ATC ATT
Lysine         Lys  K     AAA AAG
Leucine        Leu  L     TTA TTG CTA CTC CTG CTT
Methionine     Met  M     ATG
Asparagine     Asn  N     AAC AAT
Proline        Pro  P     CCA CCC CCG CCT
Glutamine      Gln  Q     CAA CAG
Arginine       Arg  R     AGA AGG CGA CGC CGG CGT
Serine         Ser  S     AGC AGT TCA TCC TCG TCT
Threonine      Thr  T     ACA ACC ACG ACT
Valine         Val  V     GTA GTC GTG GTT
Tryptophan     Trp  W     TGG
Tyrosine       Tyr  Y     TAC TAT

Sequence alterations that do not change the amino acid sequence encoded by the
polynucleotide are termed "silent" variations. With the exception of the codons ATG and TGG,
encoding methionine and tryptophan, respectively, any of the possible codons for the same amino
acid can be substituted by a variety of techniques, e.g., site-directed mutagenesis, available in the
art. Accordingly, any and all such variations of a sequence selected from the above table are a
feature of the invention.
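
The notion of a silent variation can be checked mechanically. The sketch below builds a codon
lookup from the entries of Table 1 and tests whether two coding sequences differ at the
nucleotide level while encoding the same polypeptide. The three stop codons (TAA, TAG, TGA) are
added as an assumption because Table 1 lists only amino acid codons, and the function names are
hypothetical.

```python
# Codons grouped by one-letter amino acid code, following Table 1 (DNA alphabet).
CODONS_BY_AMINO_ACID = {
    "A": "GCA GCC GCG GCT", "C": "TGC TGT", "D": "GAC GAT", "E": "GAA GAG",
    "F": "TTC TTT", "G": "GGA GGC GGG GGT", "H": "CAC CAT", "I": "ATA ATC ATT",
    "K": "AAA AAG", "L": "TTA TTG CTA CTC CTG CTT", "M": "ATG", "N": "AAC AAT",
    "P": "CCA CCC CCG CCT", "Q": "CAA CAG", "R": "AGA AGG CGA CGC CGG CGT",
    "S": "AGC AGT TCA TCC TCG TCT", "T": "ACA ACC ACG ACT", "V": "GTA GTC GTG GTT",
    "W": "TGG", "Y": "TAC TAT",
}

CODON_TO_AMINO_ACID = {
    codon: amino_acid
    for amino_acid, codons in CODONS_BY_AMINO_ACID.items()
    for codon in codons.split()
}
# Stop codons are not part of Table 1; they are added here as "*" (assumption).
for stop in ("TAA", "TAG", "TGA"):
    CODON_TO_AMINO_ACID[stop] = "*"


def translate(dna: str) -> str:
    """Translate an in-frame DNA coding sequence into one-letter amino acids."""
    dna = dna.upper()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TO_AMINO_ACID[dna[i:i + 3]] for i in range(0, len(dna), 3))


def is_silent_variant(original: str, variant: str) -> bool:
    """True if the sequences differ but encode the same polypeptide."""
    return original.upper() != variant.upper() and translate(original) == translate(variant)
```

For instance, replacing the serine codon AGC with TCT in ATGAGCTGG leaves the encoded
Met-Ser-Trp peptide unchanged, so the variant is silent, whereas replacing it with ACC
(threonine) is not.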

In addition to silent variations, other conservative variations that alter one, or a few
amino acids in the encoded polypeptide, can be made without altering the function of the
polypeptide; these conservative variants are, likewise, a feature of the invention.

For example, substitutions, deletions and insertions introduced into the sequences
provided in the Sequence Listing are also envisioned by the invention. Such sequence
modifications can be engineered into a sequence by site-directed mutagenesis (Wu (ed.) Meth.
Enzymol. (1993) vol. 217, Academic Press) or the other methods noted below. Amino acid
substitutions are typically of single residues; insertions usually will be on the order of about from
1 to 10 amino acid residues; and deletions will range about from 1 to 30 residues. In preferred
embodiments, deletions or insertions are made in adjacent pairs, e.g., a deletion of two residues or
insertion of two residues. Substitutions, deletions, insertions or any combination thereof can be
combined to arrive at a sequence. The mutations that are made in the polynucleotide encoding the
transcription factor should not place the sequence out of reading frame and should not create
complementary regions that could produce secondary mRNA structure. Preferably, the
polypeptide encoded by the DNA performs the desired function.
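
The reading-frame requirement can be illustrated with a simple length check: an insertion or
deletion keeps the downstream codons in frame only if the net change in nucleotides is a
multiple of three. The sketch below assumes DNA coding sequences given as strings; the function
names are hypothetical, and the size limits simply restate the typical ranges given above. It is
a necessary-condition check only and does not detect a frame shift that is later compensated
within the same sequence.

```python
def preserves_reading_frame(original_cds: str, modified_cds: str) -> bool:
    """True if the net length change is a whole number of codons."""
    return (len(modified_cds) - len(original_cds)) % 3 == 0


def within_typical_size(kind: str, residues: int) -> bool:
    """Check an indel against the typical ranges stated in the text above."""
    limits = {"insertion": (1, 10), "deletion": (1, 30)}
    low, high = limits[kind]
    return low <= residues <= high
```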

Conservative substitutions are those in which at least one residue in the amino acid
sequence has been removed and a different residue inserted in its place. Such substitutions
generally are made in accordance with Table 2 when it is desired to maintain the activity of
the protein. Table 2 shows amino acids which can be substituted for an amino acid in a protein
and which are typically regarded as conservative substitutions.

Table 2

Residue     Conservative Substitutions
Ala         Ser
Arg         Lys
Asn         Gln; His
Asp         Glu
Gln         Asn
Cys         Ser
Glu         Asp
Gly         Pro
His         Asn; Gln
Ile         Leu; Val
Leu         Ile; Val
Lys         Arg; Gln
Met         Leu; Ile
Phe         Met; Leu; Tyr
Ser         Thr; Gly
Thr         Ser; Val
Trp         Tyr
Tyr         Trp; Phe
Val         Ile; Leu

Substitutions that are less conservative than those in Table 2 can be selected by picking
residues that differ more significantly in their effect on maintaining (a) the structure of the
polypeptide backbone in the area of the substitution, for example, as a sheet or helical
conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of
the side chain. The substitutions which in general are expected to produce the greatest changes in
protein properties will be those in which (a) a hydrophilic residue, e.g., seryl or threonyl, is
substituted for (or by) a hydrophobic residue, e.g., leucyl, isoleucyl, phenylalanyl, valyl or alanyl;
(b) a cysteine or proline is substituted for (or by) any other residue; (c) a residue having an
electropositive side chain, e.g., lysyl, arginyl, or histidyl, is substituted for (or by) an
electronegative residue, e.g., glutamyl or aspartyl; or (d) a residue having a bulky side chain, e.g.,
phenylalanine, is substituted for (or by) one not having a side chain, e.g., glycine.
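
Table 2 lends itself to a direct lookup. The following sketch encodes the table as a mapping
from each residue to its listed conservative substitutions and reports whether a proposed
replacement is conservative in that sense. The names are hypothetical, and the lookup is
deliberately directional: it follows the rows of Table 2 exactly rather than assuming that a
substitution listed in one direction is also conservative in the other.

```python
# Conservative substitutions keyed by original residue, transcribed from Table 2.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Ser"}, "Arg": {"Lys"}, "Asn": {"Gln", "His"}, "Asp": {"Glu"},
    "Gln": {"Asn"}, "Cys": {"Ser"}, "Glu": {"Asp"}, "Gly": {"Pro"},
    "His": {"Asn", "Gln"}, "Ile": {"Leu", "Val"}, "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln"}, "Met": {"Leu", "Ile"}, "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr", "Gly"}, "Thr": {"Ser", "Val"}, "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"}, "Val": {"Ile", "Leu"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if Table 2 lists `replacement` as a conservative substitute for `original`."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())
```

A substitution that fails this lookup, such as lysine to glutamate, is not necessarily
disallowed; it is simply a candidate for the less conservative category discussed above.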

FURTHER MODIFYING SEQUENCES OF THE INVENTION — MUTATION/ 
FORCED EVOLUTION 

In addition to generating silent or conservative substitutions as noted above, the present
invention optionally includes methods of modifying the sequences of the Sequence Listing. In
the methods, nucleic acid or protein modification methods are used to alter the given sequences to
produce new sequences and/or to chemically or enzymatically modify given sequences to change
the properties of the nucleic acids or proteins.

Thus, in one embodiment, given nucleic acid sequences are modified, e.g., according to
standard mutagenesis or artificial evolution methods to produce modified sequences. For
example, Ausubel, supra, provides additional details on mutagenesis methods. Artificial forced
evolution methods are described, e.g., by Stemmer (1994) Nature 370:389-391, and Stemmer
(1994) Proc. Natl. Acad. Sci. USA 91:10747-10751. Many other mutation and evolution methods
are also available and expected to be within the skill of the practitioner.

Similarly, chemical or enzymatic alteration of expressed nucleic acids and polypeptides
can be performed by standard methods. For example, a sequence can be modified by addition of
lipids, sugars, peptides, organic or inorganic compounds, by the inclusion of modified nucleotides
or amino acids, or the like. For example, protein modification techniques are illustrated in
Ausubel, supra. Further details on chemical and enzymatic modifications can be found herein.

These modification methods can be used to modify any given sequence, or to modify any
sequence produced by the various mutation and artificial evolution modification methods noted
herein.

Accordingly, the invention provides for modification of any given nucleic acid by
mutation, evolution, chemical or enzymatic modification, or other available methods, as well as
for the products produced by practicing such methods, e.g., using the sequences herein as a
starting substrate for the various modification approaches.

For example, an optimized coding sequence containing codons preferred by a particular
prokaryotic or eukaryotic host can be used, e.g., to increase the rate of translation or to produce
recombinant RNA transcripts having desirable properties, such as a longer half-life, as compared
with transcripts produced using a non-optimized sequence. Translation stop codons can also be
modified to reflect host preference. For example, preferred stop codons for S. cerevisiae and
mammals are TAA and TGA, respectively. The preferred stop codon for monocotyledonous
plants is TGA, whereas insects and E. coli prefer to use TAA as the stop codon.
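
The stop codon preferences listed above can be applied with a small helper. The sketch below
swaps the terminal stop codon of a coding sequence for the one preferred by a named host; the
host labels used as dictionary keys and the function name are assumptions for illustration,
while the codon preferences themselves are those stated in the preceding paragraph.

```python
# Preferred stop codon per host, as stated in the text; the key spellings are illustrative.
PREFERRED_STOP_CODON = {
    "S. cerevisiae": "TAA",
    "mammals": "TGA",
    "monocots": "TGA",
    "insects": "TAA",
    "E. coli": "TAA",
}

STOP_CODONS = {"TAA", "TAG", "TGA"}


def with_preferred_stop(cds: str, host: str) -> str:
    """Return the coding sequence with its final stop codon replaced for `host`."""
    cds = cds.upper()
    if len(cds) % 3 != 0 or cds[-3:] not in STOP_CODONS:
        raise ValueError("expected an in-frame coding sequence ending in a stop codon")
    return cds[:-3] + PREFERRED_STOP_CODON[host]
```

Only the final codon is changed, so the encoded polypeptide is unaffected; full codon
optimization of the remaining positions would require host codon usage data that is outside the
scope of this sketch.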

The polynucleotide sequences of the present invention can also be engineered in order to
alter a coding sequence for a variety of reasons, including but not limited to, alterations which
modify the sequence to facilitate cloning, processing and/or expression of the gene product. For
example, alterations are optionally introduced using techniques which are well known in the art,
e.g., site-directed mutagenesis, to insert new restriction sites, to alter glycosylation patterns, to
change codon preference, to introduce splice sites, etc.

Furthermore, a fragment or domain derived from any of the polypeptides of the invention
can be combined with domains derived from other transcription factors or synthetic domains to
modify the biological activity of a transcription factor. For instance, a DNA binding domain
derived from a transcription factor of the invention can be combined with the activation domain
of another transcription factor or with a synthetic activation domain. A transcription activation
domain assists in initiating transcription from a DNA-binding site. Examples include the
transcription activation region of VP16 or GAL4 (Moore et al. (1998) Proc. Natl. Acad. Sci. USA
95: 376-381; and Aoyama et al. (1995) Plant Cell 7:1773-1785), peptides derived from bacterial
sequences (Ma and Ptashne (1987) Cell 51: 113-119) and synthetic peptides (Giniger and
Ptashne, (1987) Nature 330:670-672).

EXPRESSION AND MODIFICATION OF POLYPEPTIDES 

Typically, polynucleotide sequences of the invention are incorporated into recombinant
DNA (or RNA) molecules that direct expression of polypeptides of the invention in appropriate
host cells, transgenic plants, in vitro translation systems, or the like. Due to the inherent
degeneracy of the genetic code, nucleic acid sequences which encode substantially the same or a
functionally equivalent amino acid sequence can be substituted for any listed sequence to provide
for cloning and expressing the relevant homologue.

Vectors, Promoters and Expression Systems
The present invention includes recombinant constructs comprising one or more of the
nucleic acid sequences herein. The constructs typically comprise a vector, such as a plasmid, a
cosmid, a phage, a virus (e.g., a plant virus), a bacterial artificial chromosome (BAC), a yeast
artificial chromosome (YAC), or the like, into which a nucleic acid sequence of the invention has
been inserted, in a forward or reverse orientation. In a preferred aspect of this embodiment, the
construct further comprises regulatory sequences, including, for example, a promoter, operably
linked to the sequence. Large numbers of suitable vectors and promoters are known to those of
skill in the art, and are commercially available.

General texts which describe molecular biological techniques useful herein, including the
use and production of vectors, promoters and many other relevant topics, include Berger,
Sambrook and Ausubel, supra. Any of the identified sequences can be incorporated into a cassette
or vector, e.g., for expression in plants. A number of expression vectors suitable for stable
transformation of plant cells or for the establishment of transgenic plants have been described,
including those described in Weissbach and Weissbach (1989) Methods for Plant Molecular
Biology, Academic Press, and Gelvin et al. (1990) Plant Molecular Biology Manual, Kluwer
Academic Publishers. Specific examples include those derived from a Ti plasmid of
Agrobacterium tumefaciens, as well as those disclosed by Herrera-Estrella et al. (1983) Nature
303: 209, Bevan (1984) Nucl. Acids Res. 12: 8711-8721, Klee (1985) Biotechnology 3: 637-642,
for dicotyledonous plants.

Alternatively, non-Ti vectors can be used to transfer the DNA into monocotyledonous
plants and cells by using free DNA delivery techniques. Such methods can involve, for example,
the use of liposomes, electroporation, microprojectile bombardment, silicon carbide whiskers, and
viruses. By using these methods transgenic plants such as wheat, rice (Christou (1991)
Biotechnology 9: 957-962) and corn (Gordon-Kamm (1990) Plant Cell 2: 603-618) can be
produced. An immature embryo can also be a good target tissue for monocots for direct DNA
delivery techniques by using the particle gun (Weeks et al. (1993) Plant Physiol 102: 1077-1084;
Vasil (1993) Biotechnology 10: 667-674; Wan and Lemaux (1994) Plant Physiol 104: 37-48),
and for Agrobacterium-mediated DNA transfer (Ishida et al. (1996) Nature Biotech 14: 745-750).

Typically, plant transformation vectors include one or more cloned plant coding sequences
(genomic or cDNA) under the transcriptional control of 5' and 3' regulatory sequences and a
dominant selectable marker. Such plant transformation vectors typically also contain a promoter
(e.g., a regulatory region controlling inducible or constitutive, environmentally- or
developmentally-regulated, or cell- or tissue-specific expression), a transcription initiation start
site, an RNA processing signal (such as intron splice sites), a transcription termination site, and/or
a polyadenylation signal.

Examples of constitutive plant promoters which can be useful for expressing the TF
sequence include: the cauliflower mosaic virus (CaMV) 35S promoter, which confers
constitutive, high-level expression in most plant tissues (see, e.g., Odell et al. (1985) Nature
313:810); the nopaline synthase promoter (An et al. (1988) Plant Physiol 88:547); and the
octopine synthase promoter (Fromm et al. (1989) Plant Cell 1: 977).

A variety of plant gene promoters that regulate gene expression in response to
environmental, hormonal, chemical, developmental signals, and in a tissue-active manner can be
used for expression of a TF sequence in plants. Choice of a promoter is based largely on the
phenotype of interest and is determined by such factors as tissue (e.g., seed, fruit, root, pollen,
vascular tissue, flower, carpel, etc.), inducibility (e.g., in response to wounding, heat, cold,
drought, light, pathogens, etc.), timing, developmental stage, and the like. Numerous known
promoters have been characterized and can favorably be employed to promote expression of a
polynucleotide of the invention in a transgenic plant or cell of interest. For example, tissue
specific promoters include: seed-specific promoters (such as the napin, phaseolin or DC3
promoter described in US Pat. No. 5,773,697), fruit-specific promoters that are active during fruit
ripening (such as the dru1 promoter (US Pat. No. 5,783,393), or the 2A11 promoter (US Pat. No.
4,943,674) and the tomato polygalacturonase promoter (Bird et al. (1988) Plant Mol Biol 11:651)),
root-specific promoters, such as those disclosed in US Patent Nos. 5,618,988, 5,837,848 and
5,905,186, pollen-active promoters such as PTA29, PTA26 and PTA13 (US Pat. No. 5,792,929),
promoters active in vascular tissue (Ringli and Keller (1998) Plant Mol Biol 37:977-988),
flower-specific (Kaiser et al. (1995) Plant Mol Biol 28:231-243), pollen (Baerson et al. (1994)
Plant Mol Biol 26:1947-1959), carpels (Ohl et al. (1990) Plant Cell 2:837-848), pollen and ovules
(Baerson et al. (1993) Plant Mol Biol 22:255-267), auxin-inducible promoters (such as that
described in van der Kop et al. (1999) Plant Mol Biol 39:979-990 or Baumann et al. (1999) Plant
Cell 11:323-334), cytokinin-inducible promoter (Guevara-Garcia (1998) Plant Mol Biol
38:743-753), promoters responsive to gibberellin (Shi et al. (1998) Plant Mol Biol 38:1053-1060,
Willmott et al. (1998) 38:817-825) and the like. Additional promoters are those that elicit
expression in response to heat (Ainley et al. (1993) Plant Mol Biol 22:13-23), light (e.g., the pea
rbcS-3A promoter, Kuhlemeier et al. (1989) Plant Cell 1:471, and the maize rbcS promoter,
Schaffner and Sheen (1991) Plant Cell 3:997); wounding (e.g., wun1, Siebertz et al. (1989) Plant
Cell 1:961); pathogens (such as the PR-1 promoter described in Buchel et al. (1999) Plant Mol.
Biol. 40:387-396, and the PDF1.2 promoter described in Manners et al. (1998) Plant Mol. Biol.
38:1071-80), and chemicals such as methyl jasmonate or salicylic acid (Gatz et al. (1997) Plant
Mol Biol 48:89-108). In addition, the timing of the expression can be controlled by using
promoters such as those acting at senescence (Gan and Amasino (1995) Science 270:1986-1988);
or late seed development (Odell et al. (1994) Plant Physiol 106:447-458).

Plant expression vectors can also include RNA processing signals that can be positioned
within, upstream or downstream of the coding sequence. In addition, the expression vectors can
include additional regulatory sequences from the 3'-untranslated region of plant genes, e.g., a 3'
terminator region to increase the stability of the mRNA, such as the PI-II terminator region of
potato or the octopine or nopaline synthase 3' terminator regions.

Additional Expression Elements
Specific initiation signals can aid in efficient translation of coding sequences. These
signals can include, e.g., the ATG initiation codon and adjacent sequences. In cases where a
coding sequence, its initiation codon and upstream sequences are inserted into the appropriate
expression vector, no additional translational control signals may be needed. However, in cases
where only coding sequence (e.g., a mature protein coding sequence), or a portion thereof, is
inserted, exogenous translational control signals including the ATG initiation codon can be
separately provided. The initiation codon is provided in the correct reading frame to facilitate
translation. Exogenous translational elements and initiation codons can be of various origins,
both natural and synthetic. The efficiency of expression can be enhanced by the inclusion of
enhancers appropriate to the cell system in use.
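
The requirement that a separately provided initiation codon sit in the correct reading frame
can be sketched as follows. The helper assumes the inserted fragment is an in-frame DNA coding
sequence supplied as a string and prepends ATG only when the fragment does not already begin
with it; the function name is hypothetical, and adjacent sequence context that may also
influence initiation is not modelled.

```python
def ensure_initiation_codon(coding_fragment: str) -> str:
    """Return the fragment with an in-frame ATG at its 5' end."""
    sequence = coding_fragment.upper()
    if len(sequence) % 3 != 0:
        raise ValueError("fragment must contain a whole number of codons")
    if sequence.startswith("ATG"):
        return sequence
    # Adding exactly one codon keeps every downstream codon in the same frame.
    return "ATG" + sequence
```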

Expression Hosts 

The present invention also relates to host cells which are transduced with vectors of the
invention, and the production of polypeptides of the invention (including fragments thereof) by
recombinant techniques. Host cells are genetically engineered (i.e., nucleic acids are introduced,
e.g., transduced, transformed or transfected) with the vectors of this invention, which may be, for
example, a cloning vector or an expression vector comprising the relevant nucleic acids herein.
The vector is optionally a plasmid, a viral particle, a phage, a naked nucleic acid, etc. The
engineered host cells can be cultured in conventional nutrient media modified as appropriate for
activating promoters, selecting transformants, or amplifying the relevant gene. The culture
conditions, such as temperature, pH and the like, are those previously used with the host cell
selected for expression, and will be apparent to those skilled in the art and in the references cited
herein, including Sambrook and Ausubel.

The host cell can be a eukaryotic cell, such as a yeast cell, or a plant cell, or the host cell
can be a prokaryotic cell, such as a bacterial cell. Plant protoplasts are also suitable for some
applications. For example, the DNA fragments are introduced into plant tissues, cultured plant
cells or plant protoplasts by standard methods including electroporation (Fromm et al. (1985)
Proc. Natl. Acad. Sci. USA 82, 5824), infection by viral vectors such as cauliflower mosaic virus
(CaMV) (Hohn et al. (1982) Molecular Biology of Plant Tumors (Academic Press, New York)
pp. 549-560; US 4,407,956), high velocity ballistic penetration by small particles with the nucleic
acid either within the matrix of small beads or particles, or on the surface (Klein et al. (1987)
Nature 327, 70-73), use of pollen as vector (WO 85/01856), or use of Agrobacterium tumefaciens
or A. rhizogenes carrying a T-DNA plasmid in which DNA fragments are cloned. The T-DNA
plasmid is transmitted to plant cells upon infection by Agrobacterium tumefaciens, and a portion
is stably integrated into the plant genome (Horsch et al. (1984) Science 233:496-498; Fraley et al.
(1983) Proc. Natl. Acad. Sci. USA 80, 4803).

The cell can include a nucleic acid of the invention which encodes a polypeptide, wherein
the cell expresses a polypeptide of the invention. The cell can also include vector sequences, or
the like. Furthermore, cells and transgenic plants which include any polypeptide or nucleic acid
above or throughout this specification, e.g., produced by transduction of a vector of the invention,
are an additional feature of the invention.

For long-term, high-yield production of recombinant proteins, stable expression can be
used. Host cells transformed with a nucleotide sequence encoding a polypeptide of the invention
are optionally cultured under conditions suitable for the expression and recovery of the encoded
protein from cell culture. The protein or fragment thereof produced by a recombinant cell may be
secreted, membrane-bound, or contained intracellularly, depending on the sequence and/or the
vector used. As will be understood by those of skill in the art, expression vectors containing
polynucleotides encoding mature proteins of the invention can be designed with signal sequences
which direct secretion of the mature polypeptides through a prokaryotic or eukaryotic cell
membrane.

Modified Amino Acids
Polypeptides of the invention may contain one or more modified amino acids. The
presence of modified amino acids may be advantageous in, for example, increasing polypeptide
half-life, reducing polypeptide antigenicity or toxicity, increasing polypeptide storage stability, or
the like. Amino acid(s) are modified, for example, co-translationally or post-translationally
during recombinant production or modified by synthetic or chemical means.

Non-limiting examples of a modified amino acid include incorporation or other use of
acetylated amino acids, glycosylated amino acids, sulfated amino acids, prenylated (e.g.,
farnesylated, geranylgeranylated) amino acids, PEG modified (e.g., "PEGylated") amino acids,
biotinylated amino acids, carboxylated amino acids, phosphorylated amino acids, etc. References
adequate to guide one of skill in the modification of amino acids are replete throughout the
literature.

IDENTIFICATION OF ADDITIONAL FACTORS 

A transcription factor provided by the present invention can also be used to identify
additional endogenous or exogenous molecules that can affect a phenotype or trait of interest. On
the one hand, such molecules include organic (small or large molecules) and/or inorganic
compounds that affect expression of (i.e., regulate) a particular transcription factor.
Alternatively, such molecules include endogenous molecules that are acted upon at a
transcriptional level by a transcription factor of the invention to modify a phenotype as desired.
For example, the transcription factors can be employed to identify one or more downstream genes
that are subject to a regulatory effect of the transcription factor. In one approach, a
transcription factor or transcription factor homologue of the invention is expressed in a host cell,
e.g., a transgenic plant cell, tissue or explant, and expression products, either RNA or protein, of
likely or random targets are monitored, e.g., by hybridization to a microarray of nucleic acid
probes corresponding to genes expressed in a tissue or cell type of interest, by two-dimensional
gel electrophoresis of protein products, or by any other method known in the art for assessing
expression of gene products at the level of RNA or protein. Alternatively, a transcription factor
of the invention can be used to identify promoter sequences (i.e., binding sites) involved in the
regulation of a downstream target. After identifying a promoter sequence, interactions between
the transcription factor and the promoter sequence can be modified by changing specific
nucleotides in the promoter sequence or specific amino acids in the transcription factor that
interact with the promoter sequence to alter a plant trait. Typically, transcription factor DNA
binding sites are identified by gel shift assays. After identifying the promoter regions, the
promoter region sequences can be employed in double-stranded DNA arrays to identify
molecules that affect the interactions of the transcription factors with their promoters (Bulyk et al.
(1999) Nature Biotechnology 17:573-577).

The identified transcription factors are also useful to identify proteins that modify the
activity of the transcription factor. Such modification can occur by covalent modification, such
as by phosphorylation, or by protein-protein (homo- or heteropolymer) interactions. Any method
suitable for detecting protein-protein interactions can be employed. Among the methods that can
be employed are co-immunoprecipitation, cross-linking and co-purification through gradients or
chromatographic columns, and the yeast two-hybrid system.

The two-hybrid system detects protein interactions in vivo and is described in Chien et
al. (1991) Proc. Natl. Acad. Sci. USA 88, 9578-9582 and is commercially available from
Clontech (Palo Alto, Calif.). In such a system, plasmids are constructed that encode two hybrid
proteins: one consists of the DNA-binding domain of a transcription activator protein fused to the
TF polypeptide and the other consists of the transcription activator protein's activation domain
fused to an unknown protein that is encoded by a cDNA that has been recombined into the
plasmid as part of a cDNA library. The DNA-binding domain fusion plasmid and the cDNA
library are transformed into a strain of the yeast Saccharomyces cerevisiae that contains a reporter
gene (e.g., lacZ) whose regulatory region contains the transcription activator's binding site. Neither
hybrid protein alone can activate transcription of the reporter gene. Interaction of the two
hybrid proteins reconstitutes the functional activator protein and results in expression of the
reporter gene, which is detected by an assay for the reporter gene product. Then, the library
plasmids responsible for reporter gene expression are isolated and sequenced to identify the
proteins encoded by the library plasmids. After identifying proteins that interact with the
transcription factors, assays for compounds that interfere with the TF protein-protein interactions
can be performed.

IDENTIFICATION OF MODULATORS 

In addition to the intracellular molecules described above, extracellular molecules that 
alter activity or expression of a transcription factor, either directly or indirectly, can be identified. 
For example, the methods can entail first placing a candidate molecule in contact with a plant or 
plant cell. The molecule can be introduced by topical administration, such as spraying or soaking 
of a plant, and then the molecule's effect on the expression or activity of the TF polypeptide or 
the expression of the polynucleotide is monitored. Changes in the expression of the TF polypeptide 
can be monitored by use of polyclonal or monoclonal antibodies, gel electrophoresis or the like. 
Changes in the expression of the corresponding polynucleotide sequence can be detected by use 
of microarrays, Northerns, quantitative PCR, or any other technique for monitoring changes in 
mRNA expression. These techniques are exemplified in Ausubel et al. (eds) Current Protocols in 
Molecular Biology, John Wiley & Sons (1998). Such changes in the expression levels can be 
correlated with modified plant traits, and thus identified molecules can be useful for soaking or 
spraying on fruit, vegetable and grain crops to modify traits in plants. 

Essentially any available composition can be tested for modulatory activity of expression 
or activity of any nucleic acid or polypeptide herein. Thus, available libraries of compounds such 
as chemicals, polypeptides, nucleic acids and the like can be tested for modulatory activity. 
Often, potential modulator compounds can be dissolved in aqueous or organic (e.g., DMSO-based) 
solutions for easy delivery to the cell or plant of interest in which the activity of the 
modulator is to be tested. Optionally, the assays are designed to screen large modulator 
composition libraries by automating the assay steps and providing compounds from any 
convenient source to assays, which are typically run in parallel (e.g., in microtiter formats on 
microtiter plates in robotic assays). 

In one embodiment, high throughput screening methods involve providing a 
combinatorial library containing a large number of potential compounds (potential modulator 
compounds). Such "combinatorial chemical libraries" are then screened in one or more assays, as 
described herein, to identify those library members (particular chemical species or subclasses) 
that display a desired characteristic activity. The compounds thus identified can serve as target 
compounds. 

A combinatorial chemical library can be, e.g., a collection of diverse chemical 
compounds generated by chemical synthesis or biological synthesis. For example, a 
combinatorial chemical library such as a polypeptide library is formed by combining a set of 
chemical building blocks (e.g., in one example, amino acids) in every possible way for a given 
compound length (i.e., the number of amino acids in a polypeptide compound of a set length). 
Exemplary libraries include peptide libraries, nucleic acid libraries, antibody libraries (see, e.g., 
Vaughn et al. (1996) Nature Biotechnology 14(3):309-314 and PCT/US96/10287), carbohydrate 
libraries (see, e.g., Liang et al. Science (1996) 274:1520-1522 and U.S. Patent 5,593,853), 
peptide nucleic acid libraries (see, e.g., U.S. Patent 5,539,083), and small organic molecule 
libraries (see, e.g., benzodiazepines, Baum C&EN Jan 18, page 33 (1993); isoprenoids, U.S. 
Patent 5,569,588; thiazolidinones and metathiazanones, U.S. Patent 5,549,974; pyrrolidines, U.S. 
Patents 5,525,735 and 5,519,134; morpholino compounds, U.S. Patent 5,506,337) and the like. 
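
By way of illustration only, the exhaustive combination of building blocks described above 
can be sketched in Python. The function names and the use of the twenty standard one-letter 
amino acid codes as building blocks are assumptions made for this sketch and are not part of 
the methods described herein. 

```python
from itertools import product

# Assumption for this sketch: the building blocks are the 20 standard
# amino acids, written as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def peptide_library(length, building_blocks=AMINO_ACIDS):
    """Yield every peptide of the given length built from the building blocks."""
    for combination in product(building_blocks, repeat=length):
        yield "".join(combination)


def library_size(length, building_blocks=AMINO_ACIDS):
    """Number of members in the full combinatorial library of that length."""
    return len(building_blocks) ** length


# Enumerate the complete dipeptide library.
dipeptides = list(peptide_library(2))
```

The library size grows as the number of building blocks raised to the compound length, which 
is one reason such libraries are typically screened in automated, parallel formats. 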

Preparation and screening of combinatorial or other libraries is well known to those of 
skill in the art. Such combinatorial chemical libraries include, but are not limited to, peptide 
libraries (see, e.g., U.S. Patent 5,010,175, Furka, Int. J. Pept. Prot. Res. 37:487-493 (1991) and 
Houghton et al. Nature 354:84-88 (1991)). Other chemistries for generating chemical diversity 
libraries can also be used. 

In addition, as noted, compound screening equipment for high-throughput screening is 
generally available, e.g., using any of a number of well known robotic systems that have also 
been developed for solution phase chemistries useful in assay systems. These systems include 
automated workstations including an automated synthesis apparatus and robotic systems utilizing 
robotic arms. Any of the above devices are suitable for use with the present invention, e.g., for 
high-throughput screening of potential modulators. The nature and implementation of 
modifications to these devices (if any) so that they can operate as discussed herein will be 
apparent to persons skilled in the relevant art. 

Indeed, entire high throughput screening systems are commercially available. These 
systems typically automate entire procedures including all sample and reagent pipetting, liquid 
dispensing, timed incubations, and final readings of the microplate in detector(s) appropriate for 
the assay. These configurable systems provide high throughput and rapid start up as well as a 
high degree of flexibility and customization. Similarly, microfluidic implementations of 
screening are also commercially available. 

The manufacturers of such systems provide detailed protocols for the various high 
throughput assays. Thus, for example, Zymark Corp. provides technical bulletins describing screening 
systems for detecting the modulation of gene transcription, ligand binding, and the like. The 
integrated systems herein, in addition to providing for sequence alignment and, optionally, 
synthesis of relevant nucleic acids, can include such screening apparatus to identify modulators 
that have an effect on one or more polynucleotides or polypeptides according to the present 
invention. 

In some assays it is desirable to have positive controls to ensure that the components of 
the assays are working properly. At least two types of positive controls are appropriate. That is, 
known transcriptional activators or inhibitors can be incubated with cells, plants, etc. in one 
sample of the assay, and the resulting increase or decrease in transcription can be detected by 
measuring the resulting change in RNA or protein expression, etc., according to the methods 
herein. It will be appreciated that modulators can also be combined with transcriptional 
activators or inhibitors to find modulators which inhibit transcriptional activation or 
transcriptional repression. Either expression of the nucleic acids and proteins herein or any 
additional nucleic acids or proteins activated by the nucleic acids or proteins herein, or both, can 
be monitored. 

In an embodiment, the invention provides a method for identifying compositions that 
modulate the activity or expression of a polynucleotide or polypeptide of the invention. For 
example, a test compound, whether a small or large molecule, is placed in contact with a cell, 
plant (or plant tissue or explant), or composition comprising the polynucleotide or polypeptide of 
interest and a resulting effect on the cell, plant (or tissue or explant) or composition is evaluated 
by monitoring, either directly or indirectly, one or more of: expression level of the polynucleotide 
or polypeptide, activity (or modulation of the activity) of the polynucleotide or polypeptide. In 
some cases, an alteration in a plant phenotype can be detected following contact of a plant (or 
plant cell, or tissue or explant) with the putative modulator, e.g., by modulation of expression or 
activity of a polynucleotide or polypeptide of the invention. 

SUBSEQUENCES 

Also contemplated are uses of polynucleotides, also referred to herein as 
oligonucleotides, typically having at least 12 bases, preferably at least 15, more preferably at least 
20, 30, or 50 bases, which hybridize under at least highly stringent (or ultra-high stringent or 
ultra-ultra-high stringent) conditions to a polynucleotide sequence described above. 
The polynucleotides may be used as probes, primers, sense and antisense agents, and the like, 
according to methods as noted supra. 

Subsequences of the polynucleotides of the invention, including polynucleotide 
fragments and oligonucleotides, are useful as nucleic acid probes and primers. An oligonucleotide 
suitable for use as a probe or primer is at least about 15 nucleotides in length, more often at least 
about 18 nucleotides, often at least about 21 nucleotides, frequently at least about 30 nucleotides, 
or about 40 nucleotides, or more in length. A nucleic acid probe is useful in hybridization 
protocols, e.g., to identify additional polypeptide homologues of the invention, including 
protocols for microarray experiments. Primers can be annealed to a complementary target DNA 
strand by nucleic acid hybridization to form a hybrid between the primer and the target DNA 
strand, and then extended along the target DNA strand by a DNA polymerase enzyme. Primer 
pairs can be used for amplification of a nucleic acid sequence, e.g., by the polymerase chain 
reaction (PCR) or other nucleic-acid amplification methods. See Sambrook and Ausubel, supra. 

In addition, the invention includes an isolated or recombinant polypeptide including a 
subsequence of at least about 15 contiguous amino acids encoded by the recombinant or isolated 
polynucleotides of the invention. For example, such polypeptides, or domains or fragments 
thereof, can be used as immunogens, e.g., to produce antibodies specific for the polypeptide 
sequence, or as probes for detecting a sequence of interest. A subsequence can range in size from 
about 15 amino acids in length up to and including the full length of the polypeptide. 

PRODUCTION OF TRANSGENIC PLANTS 
Modification of Traits 

The polynucleotides of the invention are favorably employed to produce transgenic plants 
with various traits, or characteristics, that have been modified in a desirable manner, e.g., to 
improve the seed characteristics of a plant. For example, alteration of expression levels or 
patterns (e.g., spatial or temporal expression patterns) of one or more of the transcription factors 
(or transcription factor homologues) of the invention, as compared with the levels of the same 
protein found in a wild type plant, can be used to modify a plant's traits. An illustrative example 
of trait modification, improved biochemical characteristics, by altering expression levels of a 
particular transcription factor is described further in the Examples and the Sequence Listing. 

Antisense and Cosuppression Approaches 
In addition to expression of the nucleic acids of the invention as gene replacement or 
plant phenotype modification nucleic acids, the nucleic acids are also useful for sense and 
antisense suppression of expression, e.g., to down-regulate expression of a nucleic acid of the 
invention, e.g., as a further mechanism for modulating plant phenotype. That is, the nucleic acids 
of the invention, or subsequences or antisense sequences thereof, can be used to block expression 
of naturally occurring homologous nucleic acids. A variety of sense and antisense technologies 
are known in the art, e.g., as set forth in Lichtenstein and Nellen (1997) Antisense Technology: A 
Practical Approach, IRL Press at Oxford University, Oxford, England. In general, sense or 
antisense sequences are introduced into a cell, where they are optionally amplified, e.g., by 
transcription. Such sequences include both simple oligonucleotide sequences and catalytic 
sequences such as ribozymes. 

For example, a reduction or elimination of expression (i.e., a "knock-out") of a 
transcription factor or transcription factor homologue polypeptide in a transgenic plant, e.g., to 
modify a plant trait, can be obtained by introducing an antisense construct corresponding to the 
polypeptide of interest as a cDNA. For antisense suppression, the transcription factor or homologue 
cDNA is arranged in reverse orientation (with respect to the coding sequence) relative to the 
promoter sequence in the expression vector. The introduced sequence need not be the full length 
cDNA or gene, and need not be identical to the cDNA or gene found in the plant type to be 
transformed. Typically, the antisense sequence need only be capable of hybridizing to the target 
gene or RNA of interest. Thus, where the introduced sequence is of shorter length, a higher 
degree of homology to the endogenous transcription factor sequence will be needed for effective 
antisense suppression. While antisense sequences of various lengths can be utilized, preferably, 
the introduced antisense sequence in the vector will be at least 30 nucleotides in length, and 
improved antisense suppression will typically be observed as the length of the antisense sequence 
increases. Preferably, the length of the antisense sequence in the vector will be greater than 100 
nucleotides. Transcription of an antisense construct as described results in the production of 
RNA molecules that are the reverse complement of mRNA molecules transcribed from the 
endogenous transcription factor gene in the plant cell. 
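
The relationship between a cDNA fragment and its antisense transcript can be sketched in 
Python as follows. This is a minimal illustration only: the function names are hypothetical, 
the sequences are treated as plain DNA strings, and the length categories simply restate the 
preferences given above (at least 30 nucleotides, preferably greater than 100). 

```python
# Watson-Crick base pairing for DNA written 5' to 3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def antisense_insert(cdna, start=0, length=None):
    """Return a cDNA fragment in reverse orientation relative to the promoter.

    The fragment need not be the full-length cDNA, so an optional start
    position and length select a subsequence (hypothetical parameters).
    """
    end = len(cdna) if length is None else start + length
    return reverse_complement(cdna[start:end])


def length_category(insert):
    """Classify an insert against the length preferences stated in the text."""
    if len(insert) > 100:
        return "preferred (greater than 100 nucleotides)"
    if len(insert) >= 30:
        return "acceptable (at least 30 nucleotides)"
    return "shorter than the stated minimum of 30 nucleotides"
```

Applying `reverse_complement` twice returns the original sequence, which mirrors the fact that 
the antisense RNA and the endogenous mRNA are mutually complementary. 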

Suppression of endogenous transcription factor gene expression can also be achieved 
using a ribozyme. Ribozymes are RNA molecules that possess highly specific endoribonuclease 
activity. The production and use of ribozymes are disclosed in U.S. Patent No. 4,987,071 and 
U.S. Patent No. 5,543,508. Synthetic ribozyme sequences including antisense RNAs can be used 
to confer RNA cleaving activity on the antisense RNA, such that endogenous mRNA molecules 
that hybridize to the antisense RNA are cleaved, which in turn leads to an enhanced antisense 
inhibition of endogenous gene expression. 

Vectors in which RNA encoded by a transcription factor or transcription factor 
homologue cDNA is over-expressed can also be used to obtain co-suppression of a corresponding 
endogenous gene, e.g., in the manner described in U.S. Patent No. 5,231,020 to Jorgensen. Such 
co-suppression (also termed sense suppression) does not require that the entire transcription factor 
cDNA be introduced into the plant cells, nor does it require that the introduced sequence be 
exactly identical to the endogenous transcription factor gene of interest. However, as with 
antisense suppression, the suppressive efficiency will be enhanced as specificity of hybridization 
is increased, e.g., as the introduced sequence is lengthened, and/or as the sequence similarity 
between the introduced sequence and the endogenous transcription factor gene is increased. 

Vectors expressing an untranslatable form of the transcription factor mRNA (e.g., 
sequences comprising one or more stop codons or nonsense mutations) can also be used to 
suppress expression of an endogenous transcription factor, thereby reducing or eliminating its 
activity and modifying one or more traits. Methods for producing such constructs are described 
in U.S. Patent No. 5,583,021. Preferably, such constructs are made by introducing a premature 
stop codon into the transcription factor gene. Alternatively, a plant trait can be modified by gene 
silencing using double-strand RNA (Sharp (1999) Genes and Development 13:139-141). 
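
The effect of a premature stop codon on a coding sequence can be illustrated with a short 
Python sketch. The function names and the example coding sequence are hypothetical and chosen 
only for illustration; the sketch assumes a DNA coding sequence read in frame from its first 
base and uses the three standard stop codons. 

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(cds):
    """Split a coding sequence into complete in-frame codons."""
    usable = len(cds) - len(cds) % 3
    return [cds[i:i + 3] for i in range(0, usable, 3)]


def introduce_premature_stop(cds, codon_index, stop="TAA"):
    """Replace the codon at codon_index (0-based) with a stop codon."""
    if stop not in STOP_CODONS:
        raise ValueError("not a stop codon")
    codons = split_codons(cds)
    if not 0 < codon_index < len(codons):
        raise IndexError("codon_index must lie after the start codon and inside the sequence")
    codons[codon_index] = stop
    # Re-attach any trailing bases that did not form a complete codon.
    return "".join(codons) + cds[len(codons) * 3:]


def codons_before_stop(cds):
    """Count the codons that would be translated before the first in-frame stop."""
    count = 0
    for codon in split_codons(cds):
        if codon in STOP_CODONS:
            break
        count += 1
    return count


# Hypothetical 7-codon coding sequence: ATG GCT GGT AAA GAA CTG TGA
example_cds = "ATGGCTGGTAAAGAACTGTGA"
mutant_cds = introduce_premature_stop(example_cds, 2)
```

In this example the unmodified sequence is read for six codons before its natural stop, 
whereas the modified sequence is read for only two. 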

Another method for abolishing the expression of a gene is by insertion mutagenesis using 
the T-DNA of Agrobacterium tumefaciens. After generating the insertion mutants, the mutants 
can be screened to identify those containing the insertion in a transcription factor or transcription 
factor homologue gene. Plants containing a single transgene insertion event at the desired gene 
can be crossed to generate homozygous plants for the mutation (Koncz et al. (1992) Methods in 
Arabidopsis Research, World Scientific). 

Alternatively, a plant phenotype can be altered by eliminating an endogenous gene, such 
as a transcription factor or transcription factor homologue, e.g., by homologous recombination 
(Kempin et al. (1997) Nature 389:802). 

A plant trait can also be modified by using the cre-lox system (for example, as described 
in US Pat. No. 5,658,772). A plant genome can be modified to include first and second lox sites 
that are then contacted with a Cre recombinase. If the lox sites are in the same orientation, the 
intervening DNA sequence between the two sites is excised. If the lox sites are in the opposite 
orientation, the intervening sequence is inverted. 
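
The two outcomes of Cre-mediated recombination can be modeled with a simple Python sketch. 
The representation is an assumption made for illustration: the genome is a list of labeled 
elements, and the tokens "lox>" and "lox<" stand for lox sites in the two possible 
orientations. The sketch also assumes that one lox site remains in the genome after excision, 
and it models inversion only as a reversal of element order. 

```python
LOX_SITES = ("lox>", "lox<")


def cre_recombine(genome):
    """Apply Cre recombination to a genome containing exactly two lox sites.

    Same orientation: the intervening elements are excised.
    Opposite orientation: the intervening elements are inverted.
    """
    positions = [i for i, element in enumerate(genome) if element in LOX_SITES]
    if len(positions) != 2:
        raise ValueError("expected exactly two lox sites")
    first, second = positions
    intervening = genome[first + 1:second]
    if genome[first] == genome[second]:
        # Excision: drop the intervening elements and one of the two sites.
        return genome[:first + 1] + genome[second + 1:]
    # Inversion: keep both sites and reverse the order of what lies between.
    return genome[:first + 1] + intervening[::-1] + genome[second:]


# Hypothetical element labels used only for this example.
excised = cre_recombine(["promoter", "lox>", "geneA", "geneB", "lox>", "terminator"])
inverted = cre_recombine(["promoter", "lox>", "geneA", "geneB", "lox<", "terminator"])
```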

The polynucleotides and polypeptides of this invention can also be expressed in a plant in 
the absence of an expression cassette by manipulating the activity or expression level of the 
endogenous gene by other means, for example, by ectopically expressing a gene by T-DNA 
activation tagging (Ichikawa et al. (1997) Nature 390:698-701; Kakimoto et al. (1996) Science 
274:982-985). This method entails transforming a plant with a gene tag containing multiple 
transcriptional enhancers; once the tag has inserted into the genome, expression of a flanking 
gene coding sequence becomes deregulated. In another example, the transcriptional machinery in 
a plant can be modified so as to increase transcription levels of a polynucleotide of the invention 
(see, e.g., PCT Publications WO 96/06166 and WO 98/53057, which describe the modification of 
the DNA binding specificity of zinc finger proteins by changing particular amino acids in the 
DNA binding motif). 

The transgenic plant can also include the machinery necessary for expressing or altering 
the activity of a polypeptide encoded by an endogenous gene, for example by altering the 
phosphorylation state of the polypeptide to maintain it in an activated state. 

Transgenic plants (or plant cells, or plant explants, or plant tissues) incorporating the 
polynucleotides of the invention and/or expressing the polypeptides of the invention can be 
produced by a variety of well established techniques as described above. Following construction 
of a vector, most typically an expression cassette, including a polynucleotide, e.g., encoding a 
transcription factor or transcription factor homologue, of the invention, standard techniques can 
be used to introduce the polynucleotide into a plant, a plant cell, a plant explant or a plant tissue 
of interest. Optionally, the plant cell, explant or tissue can be regenerated to produce a transgenic 
plant. 

The plant can be any higher plant, including gymnosperms, monocotyledonous and 
dicotyledonous plants. Suitable protocols are available for Leguminosae (alfalfa, soybean, clover, 
etc.), Umbelliferae (carrot, celery, parsnip), Cruciferae (cabbage, radish, rapeseed, broccoli, etc.), 
Cucurbitaceae (melons and cucumber), Gramineae (wheat, corn, rice, barley, millet, etc.), 
Solanaceae (potato, tomato, tobacco, peppers, etc.), and various other crops. See protocols 
described in Ammirato et al. (1984) Handbook of Plant Cell Culture - Crop Species, Macmillan 
Publ. Co.; Shimamoto et al. (1989) Nature 338:274-276; Fromm et al. (1990) Bio/Technology 
8:833-839; and Vasil et al. (1990) Bio/Technology 8:429-434. 

Transformation and regeneration of both monocotyledonous and dicotyledonous plant 
cells is now routine, and the selection of the most appropriate transformation technique will be 
determined by the practitioner. The choice of method will vary with the type of plant to be 
transformed; those skilled in the art will recognize the suitability of particular methods for given 
plant types. Suitable methods can include, but are not limited to: electroporation of plant 
protoplasts; liposome-mediated transformation; polyethylene glycol (PEG) mediated 
transformation; transformation using viruses; micro-injection of plant cells; micro-projectile 
bombardment of plant cells; vacuum infiltration; and Agrobacterium tumefaciens mediated 
transformation. Transformation means introducing a nucleotide sequence in a plant in a manner to 
cause stable or transient expression of the sequence. 

Successful examples of the modification of plant characteristics by transformation with 
cloned sequences which serve to illustrate the current knowledge in this field of technology, and 
which are herein incorporated by reference, include: U.S. Patent Nos. 5,571,706; 5,677,175; 
5,510,471; 5,750,386; 5,597,945; 5,589,615; 5,750,871; 5,268,526; 5,780,708; 5,538,880; 
5,773,269; 5,736,369 and 5,610,042. 

Following transformation, plants are preferably selected using a dominant selectable 
marker incorporated into the transformation vector. Typically, such a marker will confer 
antibiotic or herbicide resistance on the transformed plants, and selection of transformants can be 
accomplished by exposing the plants to appropriate concentrations of the antibiotic or herbicide. 

After transformed plants are selected and grown to maturity, those plants showing a 
modified trait are identified. The modified trait can be any of those traits described above. 
Additionally, to confirm that the modified trait is due to changes in expression levels or activity 
of the polypeptide or polynucleotide of the invention, mRNA expression can be analyzed 
using Northern blots, RT-PCR or microarrays, or protein expression can be analyzed using 
immunoblots, Western blots or gel shift assays. 

INTEGRATED SYSTEMS — SEQUENCE IDENTITY 

Additionally, the present invention may be an integrated system, computer or computer 
readable medium that comprises an instruction set for determining the identity of one or more 
sequences in a database. In addition, the instruction set can be used to generate or identify 
sequences that meet any specified criteria. Furthermore, the instruction set may be used to 
associate or link certain functional benefits, such as improved biochemical characteristics, with one 
or more identified sequences. 

For example, the instruction set can include a sequence comparison or other 
alignment program, e.g., an available program from the Wisconsin Package 
Version 10.0, such as BLAST, FASTA, PILEUP, FINDPATTERNS or the like (GCG, Madison, 
WI). Public sequence databases such as GenBank, EMBL, Swiss-Prot and PIR or private 
sequence databases such as PhytoSeq (Incyte Pharmaceuticals, Palo Alto, CA) can be searched. 

Alignment of sequences for comparison can be conducted by the local homology 
algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the homology alignment 
algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, by the search for similarity 
method of Pearson and Lipman (1988) Proc. Natl. Acad. Sci. U.S.A. 85:2444, or by computerized 
implementations of these algorithms. After alignment, sequence comparisons between two (or 
more) polynucleotides or polypeptides are typically performed by comparing the 
two sequences over a comparison window to identify and compare local regions of sequence 
similarity. The comparison window can be a segment of at least about 20 contiguous positions, 
usually about 50 to about 200, more usually about 100 to about 150 contiguous positions. A 
description of the method is provided in Ausubel et al., supra. 
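
A comparison over a window of contiguous aligned positions can be sketched in Python as 
follows. This is an illustrative sketch only, not the method of any cited program: the 
function names are hypothetical, the two inputs are assumed to be already aligned strings of 
equal length with "-" marking gaps, and a gap opposite a residue is counted as a mismatch. 

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared positions at which two aligned sequences agree."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue  # a column of two gaps carries no information
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def windowed_identity(aligned_a, aligned_b, window=20):
    """Percent identity for every window of contiguous aligned positions.

    The default of 20 follows the minimum comparison window mentioned in the text.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window <= 0 or window > len(aligned_a):
        raise ValueError("window must be between 1 and the alignment length")
    return [
        percent_identity(aligned_a[i:i + window], aligned_b[i:i + window])
        for i in range(len(aligned_a) - window + 1)
    ]
```

Scanning the per-window values makes it possible to pick out local regions of high 
similarity even when the overall identity of two sequences is modest. 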

A variety of methods of determining sequence relationships can be used, including 
manual alignment and computer assisted sequence alignment and analysis. This latter approach is 
a preferred approach in the present invention, due to the increased throughput afforded by 
computer assisted methods. As noted above, a variety of computer programs for performing 
sequence alignment are available, or can be produced by one of skill. 

One example algorithm that is suitable for determining percent sequence identity and 
sequence similarity is the BLAST algorithm, which is described in Altschul et al. J. Mol. Biol. 
215:403-410 (1990). Software for performing BLAST analyses is publicly available, e.g., 
through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This 
algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short 
words of length W in the query sequence, which either match or satisfy some positive-valued 
threshold score T when aligned with a word of the same length in a database sequence. T is 
referred to as the neighborhood word score threshold (Altschul et al., supra). These initial 
neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. 
The word hits are then extended in both directions along each sequence for as far as the 
cumulative alignment score can be increased. Cumulative scores are calculated using, for 
nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 
0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a 
scoring matrix is used to calculate the cumulative score. Extension of the word hits in each 
direction is halted when: the cumulative alignment score falls off by the quantity X from its 
maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of 
one or more negative-scoring residue alignments; or the end of either sequence is reached. The 
BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. 
The BLASTN program (for nucleotide sequences) uses as defaults a wordlength (W) of 11, an 
expectation (E) of 10, a cutoff of 100, M=5, N=-4, and a comparison of both strands. For amino 
acid sequences, the BLASTP program uses as defaults a wordlength (W) of 3, an expectation (E) 
of 10, and the BLOSUM62 scoring matrix (see Henikoff & Henikoff (1989) Proc. Natl. Acad. 
Sci. USA 89:10915). 
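
The seed-and-extend procedure described above can be sketched for nucleotide sequences in 
Python. The sketch is a simplification under stated assumptions and is not the BLAST 
implementation: seeds are exact word matches only (the threshold T is not modeled), extension 
is ungapped, the function names are hypothetical, and the default drop-off X of 20 is an 
assumed value. The defaults W=11, M=5 and N=-4 follow the BLASTN defaults given in the text. 

```python
def find_seeds(query, subject, w=11):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    seeds = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            seeds.append((i, j))
    return seeds


def _extend(query, subject, qi, sj, step, score, m, n, x):
    """Extend without gaps in one direction; return (best score, residues kept)."""
    best, best_len, length = score, 0, 0
    while 0 <= qi < len(query) and 0 <= sj < len(subject):
        score += m if query[qi] == subject[sj] else n
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score >= x or score <= 0:
            break  # fell off by X from the maximum, or dropped to zero or below
        qi += step
        sj += step
    return best, best_len


def extend_seed(query, subject, i, j, w=11, m=5, n=-4, x=20):
    """Extend a word hit at (i, j) both ways; return (score, q_start, q_end, s_start)."""
    score, right = _extend(query, subject, i + w, j + w, 1, w * m, m, n, x)
    score, left = _extend(query, subject, i - 1, j - 1, -1, score, m, n, x)
    return score, i - left, i + w + right, j - left


# Hypothetical sequences that share the core ACGTACGT.
query = "GGACGTACGTTT"
subject = "CCACGTACGTAA"
seeds = find_seeds(query, subject, w=4)
hsp = extend_seed(query, subject, 2, 2, w=4, x=10)
```

A short word length is used in the example only because the sequences are short; smaller W 
finds more seeds at the cost of speed, which is the sensitivity trade-off noted above. 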

In addition to calculating percent sequence identity, the BLAST algorithm also performs 
a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul (1993) 
Proc. Natl. Acad. Sci. USA 90:5873-5787). One measure of similarity provided by the BLAST 
algorithm is the smallest sum probability (P(N)), which provides an indication of the probability 
by which a match between two nucleotide or amino acid sequences would occur by chance. For 
example, a nucleic acid is considered similar to a reference sequence (and, therefore, in this 
context, homologous) if the smallest sum probability in a comparison of the test nucleic acid to 
the reference nucleic acid is less than about 0.1, or less than about 0.01, or even less than 
about 0.001. An additional example of a useful sequence alignment algorithm is PILEUP. 
PILEUP creates a multiple sequence alignment from a group of related sequences using 
progressive, pairwise alignments. The program can align, e.g., up to 300 sequences of a 
maximum length of 5,000 letters. 

The integrated system, or computer, typically includes a user input interface allowing a 
user to selectively view one or more sequence records corresponding to the one or more character 
strings, as well as an instruction set which aligns the one or more character strings with each other 
or with an additional character string to identify one or more regions of sequence similarity. The 
system may include a link of one or more character strings with a particular phenotype or gene 
function. Typically, the system includes a user readable output element which displays an 
alignment produced by the alignment instruction set. 

The methods of this invention can be implemented in a localized or distributed 
computing environment. In a distributed environment, the methods may be implemented on a single 
computer comprising multiple processors or on a multiplicity of computers. The computers can 
be linked, e.g., through a common bus, but more preferably the computer(s) are nodes on a 
network. The network can be a generalized or a dedicated local or wide-area network and, in 
certain preferred embodiments, the computers may be components of an intranet or an internet. 

Thus, the invention provides methods for identifying a sequence similar or homologous 
to one or more polynucleotides as noted herein, or one or more target polypeptides encoded by 
the polynucleotides, or otherwise noted herein and may include linking or associating a given 
plant phenotype or gene function with a sequence. In the methods, a sequence database is 
provided (locally or across an internet or intranet) and a query is made against the sequence 
database using the relevant sequences herein and associated plant phenotypes or gene functions. 

Any sequence herein can be entered into the database, before or after querying the 
database. This provides for both expansion of the database and, if done before the querying step, 
for insertion of control sequences into the database. The control sequences can be detected by the 
query to ensure the general integrity of both the database and the query. As noted, the query can 
be performed using a web browser based interface. For example, the database can be a 
centralized public database such as those noted herein, and the querying can be done from a 
remote terminal or computer across an internet or intranet. 

EXAMPLES 

The following examples are intended to illustrate but not limit the present invention. 

EXAMPLE I. FULL LENGTH GENE IDENTIFICATION AND CLONING 
Putative transcription factor sequences (genomic or ESTs) related to known transcription 
factors were identified in the Arabidopsis thaliana GenBank database using the tblastn sequence 
analysis program using default parameters and a P-value cutoff threshold of -4 or -5 or lower, 
depending on the length of the query sequence. Putative transcription factor sequence hits were 
then screened to identify those containing particular sequence strings. If the sequence hits 
contained such sequence strings, the sequences were confirmed as transcription factors. 
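
The two-step screen described above (a P-value cutoff followed by a check for particular 
sequence strings) can be sketched in Python. The sketch rests on several assumptions: the 
cutoff of -4 or -5 is read as a base-10 exponent, so that a hit passes when its P-value is at 
most 10^-4 or 10^-5; the record type, function name and accession labels are hypothetical; 
and the string "WRKYGQK" is used purely as an illustrative motif, since the text does not 
specify which strings were used. 

```python
from dataclasses import dataclass


@dataclass
class Hit:
    accession: str            # hypothetical identifier for a database hit
    p_value: float            # P-value reported for the hit
    translated_sequence: str  # translated sequence of the hit


def confirm_transcription_factors(hits, required_strings, exponent_cutoff=-4):
    """Keep hits that pass the P-value cutoff and contain a required string."""
    threshold = 10.0 ** exponent_cutoff
    confirmed = []
    for hit in hits:
        if hit.p_value > threshold:
            continue  # not significant enough at this cutoff
        if any(s in hit.translated_sequence for s in required_strings):
            confirmed.append(hit)
    return confirmed


# Hypothetical hits used only to exercise the filter.
hits = [
    Hit("hit1", 1e-6, "MSDWRKYGQKPVKGS"),  # significant and contains the motif
    Hit("hit2", 1e-2, "MSDWRKYGQKPVKGS"),  # contains the motif but not significant
    Hit("hit3", 1e-8, "MSDAAAAAAAPVKGS"),  # significant but lacks the motif
]
confirmed = confirm_transcription_factors(hits, ["WRKYGQK"], exponent_cutoff=-4)
```

A stricter exponent such as -5 can be passed for shorter query sequences, in keeping with the 
length-dependent cutoff described above. 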

Alternatively, Arabidopsis thaliana cDNA libraries derived from different tissues or 
treatments, or genomic libraries, were screened to identify novel members of a transcription 
factor family using a low stringency hybridization approach. Probes were synthesized using gene 
specific primers in a standard PCR reaction (annealing temperature 60° C) and labeled with 32P-dCTP 
using the High Prime DNA Labeling Kit (Boehringer Mannheim). Purified radiolabelled 
probes were added to filters immersed in Church hybridization medium (0.5 M NaPO4 pH 7.0, 
7% SDS, 1% w/v bovine serum albumin) and hybridized overnight at 60° C with shaking. Filters 
were washed two times for 45 to 60 minutes with 1x SSC, 1% SDS at 60° C. 

To identify additional sequence 5' or 3' of a partial cDNA sequence in a cDNA library, 5' 
and 3' rapid amplification of cDNA ends (RACE) was performed using the Marathon™ cDNA 
amplification kit (Clontech, Palo Alto, CA). Generally, the method entailed first isolating 
poly(A) mRNA, performing first and second strand cDNA synthesis to generate double stranded 
cDNA, blunting cDNA ends, followed by ligation of the Marathon™ Adaptor to the cDNA to 
form a library of adaptor-ligated ds cDNA. 

Gene-specific primers were designed to be used along with adaptor specific primers for 
both 5' and 3' RACE reactions. Nested primers, rather than single primers, were used to increase 
PCR specificity. Using 5' and 3' RACE reactions, 5' and 3' RACE fragments were obtained, 
sequenced and cloned. The process can be repeated until the 5' and 3' ends of the full-length gene 
are identified. Then the full-length cDNA was generated by end-to-end PCR using primers 
specific to the 5' and 3' ends of the gene. 

EXAMPLE II. CONSTRUCTION OF EXPRESSION VECTORS 

The sequence was amplified from a genomic or cDNA library using primers specific to 
sequences upstream and downstream of the coding region. The expression vector was pMEN20 
or pMEN65, which are both derived from pMON316 (Sanders et al., (1987) Nucleic Acids 
Research 15:1543-58) and contain the CaMV 35S promoter to express transgenes. To clone the 
sequence into the vector, both pMEN20 and the amplified DNA fragment were digested 
separately with SalI and NotI restriction enzymes at 37° C for 2 hours. The digestion products 
were subjected to electrophoresis in a 0.8% agarose gel and visualized by ethidium bromide 
staining. The DNA fragments containing the sequence and the linearized plasmid were excised 
and purified by using a Qiaquick gel extraction kit (Qiagen, CA). The fragments of interest were 
ligated at a ratio of 3:1 (vector to insert). Ligation reactions using T4 DNA ligase (New England 
Biolabs, MA) were carried out at 16° C for 16 hours. The ligated DNAs were transformed into 
competent cells of the E. coli strain DH5alpha by using the heat shock method. The 
transformations were plated on LB plates containing 50 mg/l kanamycin (Sigma). 
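
As an illustration of how a ligation ratio such as the 3:1 (vector to insert) ratio above translates into DNA amounts, the following sketch computes the insert mass for a given vector mass. It assumes the ratio is a molar ratio, which the text does not state, and the fragment lengths and the function name `insert_mass_ng` are hypothetical.

```python
def insert_mass_ng(vector_ng, vector_bp, insert_bp, vector_to_insert=3.0):
    """Mass of insert (ng) giving the requested vector:insert molar ratio.

    For double stranded DNA, the number of moles is proportional to
    mass divided by length, so the conversion needs only the two lengths.
    """
    if min(vector_ng, vector_bp, insert_bp, vector_to_insert) <= 0:
        raise ValueError("all arguments must be positive")
    vector_units = vector_ng / vector_bp          # proportional to moles of vector
    insert_units = vector_units / vector_to_insert
    return insert_units * insert_bp


# Hypothetical example: 100 ng of a 10 kb vector and a 1.5 kb insert.
print(round(insert_mass_ng(vector_ng=100.0, vector_bp=10000, insert_bp=1500), 2))
```

If the ratio were instead meant by mass, the insert mass would simply be the vector mass divided by three.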

Individual colonies were grown overnight in five milliliters of LB broth containing 50 
mg/l kanamycin at 37° C. Plasmid DNA was purified by using Qiaquick Mini Prep kits (Qiagen, 
CA). 

EXAMPLE III. TRANSFORMATION OF AGROBACTERIUM WITH THE 
EXPRESSION VECTOR 

After the plasmid vector containing the gene was constructed, the vector was used to 
transform Agrobacterium tumefaciens cells expressing the gene products. The stock of 
Agrobacterium tumefaciens cells for transformation was made as described by Nagel et al. 
(1990) FEMS Microbiol. Letts. 67: 325-328. Agrobacterium strain ABI was grown in 250 ml LB 
medium (Sigma) overnight at 28°C with shaking until an absorbance (A600) of 0.5 - 1.0 was 
reached. Cells were harvested by centrifugation at 4,000 x g for 15 min at 4°C. Cells were then 
resuspended in 250 µl chilled buffer (1 mM HEPES, pH adjusted to 7.0 with KOH). Cells were 
centrifuged again as described above and resuspended in 125 µl chilled buffer. Cells were then 
centrifuged and resuspended two more times in the same HEPES buffer as described above at a 
volume of 100 µl and 750 µl, respectively. Resuspended cells were then distributed into 40 µl 
aliquots, quickly frozen in liquid nitrogen, and stored at -80° C. 

Agrobacterium cells were transformed with plasmids prepared as described above 
following the protocol described by Nagel et al. For each DNA construct to be transformed, 50 - 
100 ng DNA (generally resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 8.0) was mixed with 
40 µl of Agrobacterium cells. The DNA/cell mixture was then transferred to a chilled cuvette 
with a 2 mm electrode gap and subjected to a 2.5 kV charge dissipated at 25 µF and 200 µF using a 
Gene Pulser II apparatus (Bio-Rad). After electroporation, cells were immediately resuspended 
in 1.0 ml LB and allowed to recover without antibiotic selection for 2 - 4 hours at 28° C in a 
shaking incubator. After recovery, cells were plated onto selective medium of LB broth 
containing 100 µg/ml spectinomycin (Sigma) and incubated for 24-48 hours at 28° C. Single 
colonies were then picked and inoculated in fresh medium. The presence of the plasmid construct 
was verified by PCR amplification and sequence analysis. 

EXAMPLE IV. TRANSFORMATION OF ARABIDOPSIS PLANTS WITH 
AGROBACTERIUM TUMEFACIENS WITH EXPRESSION VECTOR 

After transformation of Agrobacterium tumefaciens with plasmid vectors containing the 
gene, single Agrobacterium colonies were identified, propagated, and used to transform 
Arabidopsis plants. Briefly, 500 ml cultures of LB medium containing 50 mg/l kanamycin were 
inoculated with the colonies and grown at 28° C with shaking for 2 days until an absorbance 
(A600) of > 2.0 was reached. Cells were then harvested by centrifugation at 4,000 x g for 10 min, 
and resuspended in infiltration medium (1/2 X Murashige and Skoog salts (Sigma), 1 X 
Gamborg's B-5 vitamins (Sigma), 5.0% (w/v) sucrose (Sigma), 0.044 µM benzylamino purine 
(Sigma), 200 µl/L Silwet L-77 (Lehle Seeds)) until an absorbance (A600) of 0.8 was reached. 
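
The adjustment of the resuspended cells to an A600 of 0.8 can be estimated in advance with a simple dilution calculation. The sketch below is illustrative only: it assumes that absorbance scales linearly with cell density (which holds only approximately at high readings), and the example volumes and the function name `final_volume_ml` are hypothetical.

```python
def final_volume_ml(stock_volume_ml, stock_a600, target_a600=0.8):
    """Total volume to which a cell suspension must be brought so that
    its absorbance drops from stock_a600 to target_a600.

    Uses C1 * V1 = C2 * V2 with absorbance as a proxy for cell density.
    """
    if stock_a600 < target_a600:
        raise ValueError("suspension is already below the target absorbance")
    return stock_volume_ml * stock_a600 / target_a600


# Hypothetical example: 50 ml of cells measured at A600 = 4.0.
total = final_volume_ml(stock_volume_ml=50.0, stock_a600=4.0)
print(f"bring to {total:.0f} ml total; add {total - 50.0:.0f} ml infiltration medium")
```

In practice the final absorbance would still be confirmed by measurement, as the protocol describes.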

Prior to transformation, Arabidopsis thaliana seeds (ecotype Columbia) were sown at a 
density of ~10 plants per 4" pot onto Pro-Mix BX potting medium (Hummert International) 
covered with fiberglass mesh (18 mm X 16 mm). Plants were grown under continuous 
illumination (50-75 µE/m²/sec) at 22-23° C with 65-70% relative humidity. After about 4 weeks, 
primary inflorescence stems (bolts) were cut off to encourage growth of multiple secondary bolts. 
After flowering of the mature secondary bolts, plants were prepared for transformation by 
removal of all siliques and opened flowers. 

The pots were then immersed upside down in the mixture of Agrobacterium infiltration 
medium as described above for 30 sec, and placed on their sides to allow draining into a 1' x 2' 
flat surface covered with plastic wrap. After 24 h, the plastic wrap was removed and pots were 
turned upright. The immersion procedure was repeated one week later, for a total of two 
immersions per pot. Seeds were then collected from each transformation pot and analyzed 
following the protocol described below. 

EXAMPLE V. IDENTIFICATION OF ARABIDOPSIS PRIMARY 
TRANSFORMANTS 

Seeds collected from the transformation pots were sterilized essentially as follows. Seeds 
were dispersed into a solution containing 0.1% (v/v) Triton X-100 (Sigma) and sterile H2O and 
washed by shaking the suspension for 20 min. The wash solution was then drained and replaced 
with fresh wash solution to wash the seeds for 20 min with shaking. After removal of the second 
wash solution, a solution containing 0.1% (v/v) Triton X-100 and 70% ethanol (Equistar) was 
added to the seeds and the suspension was shaken for 5 min. After removal of the 
ethanol/detergent solution, a solution containing 0.1% (v/v) Triton X-100 and 30% (v/v) bleach 
(Clorox) was added to the seeds, and the suspension was shaken for 10 min. After removal of the 
bleach/detergent solution, seeds were then washed five times in sterile distilled H2O. The seeds 
were stored in the last wash water at 4° C for 2 days in the dark before being plated onto antibiotic 
selection medium (1 X Murashige and Skoog salts (pH adjusted to 5.7 with 1M KOH), 1 X 
Gamborg's B-5 vitamins, 0.9% phytagar (Life Technologies), and 50 mg/l kanamycin). Seeds 
were germinated under continuous illumination (50-75 µE/m²/sec) at 22-23° C. After 7-10 days 
of growth under these conditions, kanamycin resistant primary transformants (T1 generation) 
were visible and obtained. These seedlings were transferred first to fresh selection plates where 
the seedlings continued to grow for 3-5 more days, and then to soil (Pro-Mix BX potting 
medium). 

Primary transformants were crossed and progeny seeds (T2) collected; kanamycin 
resistant seedlings were selected and analyzed. The expression levels of the recombinant 
polynucleotides in the transformants varied from about a 5% expression level increase to at least a 
100% expression level increase. Similar observations were made with respect to polypeptide level 
expression. 

EXAMPLE VI. IDENTIFICATION OF ARABIDOPSIS PLANTS WITH 
TRANSCRIPTION FACTOR GENE KNOCKOUTS 

The screening of insertion mutagenized Arabidopsis collections for null mutants in a 
known target gene was essentially as described in Krysan et al. (1999) Plant Cell 11:2283-2290. 
Briefly, gene-specific primers, nested by 5-250 base pairs to each other, were designed from the 
5' and 3' regions of a known target gene. Similarly, nested sets of primers were also created 
specific to each of the T-DNA or transposon ends (the "right" and "left" borders). All possible 
combinations of gene specific and T-DNA/transposon primers were used to detect by PCR an 
insertion event within or close to the target gene. The amplified DNA fragments were then 
sequenced, which allowed the precise determination of the T-DNA/transposon insertion point 
relative to the target gene. Insertion events within the coding or intervening sequence of the 
genes were deconvoluted from a pool comprising a plurality of insertion events to a single unique 
mutant plant for functional characterization. The method is described in more detail in Yu and 
Adam, US Application Serial No. 09/177,733 filed October 23, 1998. 
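
The requirement that all possible combinations of gene-specific and T-DNA/transposon primers be tested can be enumerated programmatically. The following is a minimal sketch in which the primer names are hypothetical placeholders (no actual primer sequences are given in this example) and one outer and one nested primer is assumed per gene end and per border.

```python
from itertools import product


def primer_pairs(gene_primers, border_primers):
    """Return every (gene-specific, border-specific) primer pairing."""
    return list(product(gene_primers, border_primers))


# Hypothetical primer names: outer and nested primers at each gene end
# and at each T-DNA/transposon border.
gene_primers = ["gene_5p_outer", "gene_5p_nested", "gene_3p_outer", "gene_3p_nested"]
border_primers = ["LB_outer", "LB_nested", "RB_outer", "RB_nested"]

pairs = primer_pairs(gene_primers, border_primers)
for gene_primer, border_primer in pairs:
    print(gene_primer, "+", border_primer)
print(len(pairs), "PCR reactions per DNA pool")
```

With four primers on each side this gives 16 reactions per pool, which is why pooling and subsequent deconvolution are used to keep the screen tractable.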

EXAMPLE VII. IDENTIFICATION OF MODIFIED BIOCHEMICAL 
CHARACTERISTICS PHENOTYPE IN OVEREXPRESSOR OR GENE KNOCKOUT 
PLANTS 

Experiments were performed to identify those transformants or knockouts that exhibited 
modified biochemical characteristics. Among the biochemicals that were assayed were insoluble 
sugars, such as arabinose, fucose, galactose, mannose, rhamnose or xylose or the like; prenyl 
lipids, such as lutein, beta-carotene, xanthophyll-1, xanthophyll-2, chlorophylls A or B, or alpha-, 
delta- or gamma-tocopherol or the like; fatty acids, such as 16:0 (palmitic acid), 16:1 (palmitoleic 
acid), 18:0 (stearic acid), 18:1 (oleic acid), 18:2 (linoleic acid), 20:0, 18:3 (linolenic acid), 20:1 
(eicosenoic acid), 20:2, 22:1 (erucic acid) or the like; waxes, such as by altering the levels of C29, 
C31, or C33 alkanes; sterols, such as brassicasterol, campesterol, stigmasterol, sitosterol or 
stigmastanol or the like; glucosinolates; and protein or oil levels. 

Fatty acids were measured using two methods depending on whether the tissue was from 
leaves or seeds. For leaves, lipids were extracted and esterified with hot methanolic H2SO4 and 
partitioned into hexane from methanolic brine. For seed fatty acids, seeds were pulverized and 
extracted in methanol:heptane:toluene:2,2-dimethoxypropane:H2SO4 (39:34:20:5:2) for 90 
minutes at 80°C. After cooling to room temperature the upper phase, containing the seed fatty 
acid esters, was subjected to GC analysis. Fatty acid esters from both seed and leaf tissues were 
analyzed with a Supelco SP-2330 column. 






WO 01/36597 PCT/US00/31344 

Glucosinolates were purified from seeds or leaves by first heating the tissue at 95°C for 
10 minutes. Preheated ethanol:water (50:50) was added and, after heating at 95°C for a further 
10 minutes, the extraction solvent was applied to a DEAE Sephadex column which had been 
previously equilibrated with 0.5 M pyridine acetate. Desulfoglucosinolates were eluted with 
300 µl water and analyzed by reverse phase HPLC monitoring at 226 nm. 

For wax alkanes, samples were extracted using the same method as for fatty acids and 
extracts were analyzed on a HP 5890 GC coupled with a 5973 MSD. Samples were 
chromatographed on a J&W DB35 mass spectrometer (J&W Scientific). 

To measure prenyl lipid levels, seeds or leaves were pulverized with 1 to 2% pyrogallol 
as an antioxidant. For seeds, extracted samples were filtered and a portion removed for 
tocopherol and carotenoid/chlorophyll analysis by HPLC. The remaining material was saponified 
for sterol determination. For leaves, an aliquot was removed and diluted with methanol and 
chlorophyll A, chlorophyll B, and total carotenoids measured by spectrophotometry by 
determining absorbance at 665.2 nm, 652.5 nm, and 470 nm. An aliquot was removed for 
tocopherol and carotenoid/chlorophyll composition by HPLC using a Waters µBondapak C18 
column (4.6 mm x 150 mm). The remaining methanolic solution was saponified with 10% KOH 
at 80°C for one hour. The samples were cooled and diluted with a mixture of methanol and 
water. A solution of 2% methylene chloride in hexane was mixed in and the samples were 
centrifuged. The aqueous methanol phase was re-extracted with 2% methylene chloride in 
hexane and, after centrifugation, the two upper phases were combined and evaporated. 2% 
methylene chloride in hexane was added to the tubes and the samples were then extracted with 
one ml of water. The upper phase was removed, dried, and resuspended in 400 µl of 2% 
methylene chloride in hexane and analyzed by gas chromatography using a 50 m DB-5ms (0.25 
mm ID, 0.25 µm phase, J&W Scientific). 

Insoluble sugar levels were measured by the method essentially described by Reiter et al., 
Plant Journal 12:335-345. This method analyzes the neutral sugar composition of cell wall 
polymers found in Arabidopsis leaves. Soluble sugars were separated from sugar polymers by 
extracting leaves with hot 70% ethanol. The remaining residue containing the insoluble 
polysaccharides was then acid hydrolyzed with allose added as an internal standard. Sugar 
monomers generated by the hydrolysis were then reduced to the corresponding alditols by 
treatment with NaBH4, then were acetylated to generate the volatile alditol acetates which were 
then analyzed by GC-FID. Identity of the peaks was determined by comparing the retention times 
of known sugars converted to the corresponding alditol acetates with the retention times of peaks 
from wild-type plant extracts. Alditol acetates were analyzed on a Supelco SP-2330 capillary 
column (30 m x 250 µm x 0.2 µm) using a temperature program beginning at 180° C for 2 
minutes followed by an increase to 220° C in 4 minutes. After holding at 220° C for 10 minutes, 
the oven temperature was increased to 240° C in 2 minutes and held at this temperature for 10 
minutes and brought back to room temperature. 
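
The oven temperature program described above can be written out as a list of steps to check the implied ramp rates and total run time. This is a sketch only: the step encoding and the function name `summarize_program` are illustrative choices, and the final cool-down to room temperature is left out because no duration is given for it.

```python
# Each step is (kind, temperature reached in deg C, duration in minutes).
program = [
    ("hold", 180, 2),
    ("ramp", 220, 4),
    ("hold", 220, 10),
    ("ramp", 240, 2),
    ("hold", 240, 10),
]


def summarize_program(steps):
    """Return (total minutes, list of ramp rates in deg C per minute)."""
    total_minutes = 0.0
    ramp_rates = []
    current_temp = steps[0][1]
    for kind, temp, minutes in steps:
        if kind == "ramp":
            ramp_rates.append((temp - current_temp) / minutes)
        current_temp = temp
        total_minutes += minutes
    return total_minutes, ramp_rates


total_minutes, ramp_rates = summarize_program(program)
print(total_minutes, ramp_rates)
```

Both ramps work out to 10 degrees C per minute, for a run of 28 minutes before cooling.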

To identify plants with alterations in total seed oil or protein content, 150 mg of seeds 
from T2 progeny plants were subjected to analysis by Near Infrared Reflectance (NIR) using a 
Foss NirSystems Model 6500 with a spinning cup transport system. 

Table 3 shows the phenotypes observed for particular overexpressor or knockout plants 
and provides the SEQ ID No., the internal reference code (GID), whether a knockout or 
overexpressor plant was analyzed and the observed phenotype. 



Table 3 

SEQ ID No. | GID | Knockout (KO) or overexpressor (OE) | Phenotype observed 
1 | G214 | OE | Increase in leaf fatty acids, for example 100% increase in 18:0 fatty acid. Also up to 100% increase in leaf chlorophyll and 100% increase in leaf carotenoids 
3 | G231 | OE | Up to 5% increase in leaf 18:3 fatty acid 
5 | G274 | OE | Up to 50% increase in leaf arabinose 
7 | G307 | OE | Altered in leaf insoluble sugars, for example up to 44% decrease in mannose. 
9 | G346 | OE | Altered leaf fatty acids, for example 25% increase in 16:3 and altered insoluble sugars, for example up to 25% increase in fucose 
11 | G598 | OE | Altered in insoluble sugars, for example up to 20% decrease in rhamnose and up to 10% increase in galactose 
13 | G605 | OE | Altered in leaf fatty acids, for example up to 20% increase in 16:1 fatty acid. 
15 | G777 | OE | Altered in insoluble sugars, for example up to 60% increase in leaf rhamnose 
17 | G869 | OE | Alteration in leaf fatty acids, e.g., up to 39% decrease in 16:0 fatty acid; up to 43% increase in fucose 
19 | G1133 | OE | Up to 34% decrease in leaf lutein 
21 | G1266 | OE | Alteration in leaf fatty acids, for example up to 50% increase in 18:0 fatty acid. Alterations in leaf insoluble sugars, for example a 45% decrease in rhamnose 
23 | G1324 | OE | Up to 65% decrease in leaf lutein and up to 84% increase in leaf xanthophyll 
25 | G1337 | OE | Alteration in leaf fatty acids, for example up to 28% increase in 18:1 fatty acid 
27 | G975 | OE | Up to 13-fold increase in wax in leaves 



For a particular overexpressor that shows a less beneficial biochemical characteristic, it 
may be more useful to select a plant with a decreased expression of the particular transcription 
factor. For a particular knockout that shows a less beneficial biochemical characteristic, it may be 
more useful to select a plant with an increased expression of the particular transcription factor. 



EXAMPLE VIII. IDENTIFICATION OF HOMOLOGOUS SEQUENCES 

Homologous sequences from Arabidopsis and plant species other than Arabidopsis were 
identified using database sequence search tools, such as the Basic Local Alignment Search Tool 
(BLAST) (Altschul et al. (1990) J. Mol. Biol. 215:403-410; and Altschul et al. (1997) Nucl. Acid 
Res. 25: 3389-3402). The tblastx sequence analysis programs were employed using the 
BLOSUM-62 scoring matrix (Henikoff, S. and Henikoff, J. G. (1992) Proc. Natl. Acad. Sci. USA 
89: 10915-10919). 

Identified Arabidopsis homologous sequences are provided in Figure 2 and included in 
the Sequence Listing. The percent sequence identity among these sequences is as low as 47%. 
Additionally, the entire NCBI GenBank database was filtered for sequences 
from all plants except Arabidopsis thaliana by selecting all entries in the NCBI GenBank 
database associated with NCBI taxonomic ID 33090 (Viridiplantae; all plants) and excluding 
entries associated with taxonomic ID 3701 (Arabidopsis thaliana). These sequences were 
compared to sequences representing genes of SEQ ID Nos. 1-54 on 9/26/2000 using the 
Washington University TBLASTX algorithm (version 2.0a19MP). For each gene of SEQ ID 
Nos. 1-54, individual comparisons were ordered by probability score (P-value), where the score 
reflects the probability that a particular alignment occurred by chance. For example, a score of 
3.6e-40 is 3.6 x 10^-40. For up to ten species, the gene with the lowest P-value (and therefore the 
most likely homolog) is listed in Figure 3. 
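
The selection rule described above (keep, for each species, the hit with the lowest P-value, and report up to ten species) can be sketched as follows. The hit identifiers and values are hypothetical, and the function name `best_hit_per_species` is illustrative; the sketch does not reproduce the actual TBLASTX output format.

```python
def best_hit_per_species(hits, max_species=10):
    """Keep the lowest P-value hit for each species, then return at most
    max_species of those hits ordered from most to least significant.

    hits: iterable of (identifier, p_value, species) tuples.
    """
    best = {}
    for identifier, p_value, species in hits:
        if species not in best or p_value < best[species][1]:
            best[species] = (identifier, p_value, species)
    return sorted(best.values(), key=lambda hit: hit[1])[:max_species]


# Hypothetical hits; P-values in "3.6e-40" notation parse directly with float().
hits = [
    ("id1", float("3.6e-40"), "Zea mays"),
    ("id2", float("2.0e-12"), "Zea mays"),
    ("id3", float("5.0e-25"), "Oryza sativa"),
]
for identifier, p_value, species in best_hit_per_species(hits):
    print(identifier, f"{p_value:.2E}", species)
```

Sorting on the P-value itself, rather than on its printed form, avoids ordering errors between values such as 1.70E-231 and 1.50E-89.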

In addition to P-values, comparisons were also scored by percentage identity. Percentage 
identity reflects the degree to which two segments of DNA or protein are identical over a 
particular length. The ranges of percent identity between the non-Arabidopsis genes shown in 
Figure 3 and the Arabidopsis genes in the sequence listing are: SEQ ID No. 1: 38%-89%; SEQ ID 
No. 3: 64%-88%; SEQ ID No. 5: 44%-84%; SEQ ID No. 7: 35%-86%; SEQ ID No. 9: 43%-77%; 
SEQ ID No. 11: 43%-85%; SEQ ID No. 13: 41%-76%; SEQ ID No. 15: 34%-63%; SEQ ID No. 
17: 31%-68%; SEQ ID No. 19: 26%^4%; SEQ ID No. 21: 52%-70%; SEQ ID No. 23: 37%-93%; 
SEQ ID No. 25: 37%-58%; SEQ ID No. 27: 48%-92%; SEQ ID No. 29: 42%-88%; SEQ ID 
No. 31: 47%-90%; SEQ ID No. 33: 45%-69%; SEQ ID No. 35: 42%-94%; SEQ ID No. 37: 38%-85%; 
SEQ ID No. 39: 49%-93%; SEQ ID No. 41: 36%-64%; and SEQ ID No. 43: 36%-70%. 
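
As a worked illustration of percentage identity over an aligned length, the sketch below counts identical positions between two pre-aligned segments. It is not the scoring used by the TBLASTX program; the treatment of gaps (columns containing "-" are skipped) is one common convention assumed here, and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of compared alignment columns in which both sequences
    carry the same residue. Columns with a gap ("-") are not compared."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical 8-residue segments differing at two positions.
print(percent_identity("MKTAYIAK", "MKSAYLAK"))
```

Here six of eight positions agree, giving 75% identity over the compared length.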

The polynucleotides and polypeptides in the Sequence Listing and the identified 
homologous sequences may be stored in a computer system and have associated or linked with 
the sequences a function, such as that the polynucleotides and polypeptides are useful for 
modifying the biochemical characteristics of a plant. 

All references, publications, patents and other documents herein are incorporated by 
reference in their entirety for all purposes. Although the invention has been described with 
reference to the embodiments and examples above, it should be understood that various 
modifications can be made without departing from the spirit of the invention. 

What is claimed is: 

1. A transgenic plant with a modified biochemical characteristic, which plant comprises a 
recombinant polynucleotide comprising a nucleotide sequence selected from the group consisting 
of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected from 
SEQ ID Nos. 2N, where N=1-22, or a complementary nucleotide sequence thereof; 

(b) a nucleotide sequence encoding a polypeptide comprising a conservatively substituted 
variant of a polypeptide of (a); 

(c) a nucleotide sequence comprising a sequence selected from those of SEQ ID Nos. 2N-1, 
where N=1-22, or a complementary nucleotide sequence thereof; 

(d) a nucleotide sequence comprising silent substitutions in a nucleotide sequence of (c); 

(e) a nucleotide sequence which hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a), (b), (c), or (d); 

(f) a nucleotide sequence comprising at least 15 consecutive nucleotides of a sequence of 
any of (a)-(e); 

(g) a nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which 
subsequence or fragment encodes a polypeptide that modifies a plant's biochemical 
characteristic; 

(h) a nucleotide sequence having at least 31% sequence identity to a nucleotide sequence 
of any of (a)-(g); 

(i) a nucleotide sequence having at least 60% sequence identity to a nucleotide 
sequence of any of (a)-(g); 

(j) a nucleotide sequence which encodes a polypeptide having at least 31% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-22; 

(k) a nucleotide sequence which encodes a polypeptide having at least 60% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-22; and 

(l) a nucleotide sequence which encodes a polypeptide having at least 65% sequence 
identity to a conserved domain of a polypeptide of SEQ ID Nos. 2N, where N=1-22. 

2. The transgenic plant of claim 1, further comprising a constitutive, inducible, or tissue-active 
promoter operably linked to said nucleotide sequence. 

3. The transgenic plant of claim 1, wherein the plant is selected from the group consisting 
of: soybean, wheat, corn, potato, cotton, rice, oilseed rape, sunflower, alfalfa, sugarcane, turf, 
banana, blackberry, blueberry, strawberry, raspberry, cantaloupe, carrot, cauliflower, coffee, 
cucumber, eggplant, grapes, honeydew, lettuce, mango, melon, onion, papaya, peas, peppers, 
pineapple, spinach, squash, sweet corn, tobacco, tomato, watermelon, rosaceous fruits, and 
vegetable brassicas. 



4. An isolated or recombinant polynucleotide comprising a nucleotide sequence selected 
from the group consisting of: 

(a) a nucleotide sequence encoding a polypeptide comprising a sequence selected from 
SEQ ID Nos. 2N, where N=1-22, or a complementary nucleotide sequence thereof; 

(b) a nucleotide sequence encoding a polypeptide comprising a conservatively substituted 
variant of a polypeptide of (a); 

(c) a nucleotide sequence comprising a sequence selected from those of SEQ ID Nos. 2N-1, 
where N=1-22, or a complementary nucleotide sequence thereof; 

(d) a nucleotide sequence comprising silent substitutions in a nucleotide sequence of (c); 

(e) a nucleotide sequence which hybridizes under stringent conditions to a nucleotide 
sequence of one or more of: (a), (b), (c), or (d); 

(f) a nucleotide sequence comprising at least 15 consecutive nucleotides of a sequence of 
any of (a)-(e); 

(g) a nucleotide sequence comprising a subsequence or fragment of any of (a)-(f), which 
subsequence or fragment encodes a polypeptide that modifies a plant's biochemical 
characteristic; 

(h) a nucleotide sequence having at least 31% sequence identity to a nucleotide sequence 
of any of (a)-(g); 

(i) a nucleotide sequence having at least 60% sequence identity to a nucleotide 
sequence of any of (a)-(g); 

(j) a nucleotide sequence which encodes a polypeptide having at least 31% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-22; 

(k) a nucleotide sequence which encodes a polypeptide having at least 60% 
sequence identity to a polypeptide of SEQ ID Nos. 2N, where N=1-22; and 

(l) a nucleotide sequence which encodes a conserved domain of a polypeptide having at 
least 65% sequence identity to a conserved domain of a polypeptide of SEQ ID Nos. 2N, 
where N=1-22. 






5. The isolated or recombinant polynucleotide of claim 4, further comprising a constitutive, 
inducible, or tissue-active promoter operably linked to the nucleotide sequence. 

6. A cloning or expression vector comprising the isolated or recombinant polynucleotide of 
claim 4. 

7. A cell comprising the cloning or expression vector of claim 6. 






8. A transgenic plant comprising the isolated or recombinant polynucleotide of claim 4. 



9. A composition produced by one or more of: 

(a) incubating one or more polynucleotide of claim 4 with a nuclease; 

(b) incubating one or more polynucleotide of claim 4 with a restriction enzyme; 

(c) incubating one or more polynucleotide of claim 4 with a polymerase; 

(d) incubating one or more polynucleotide of claim 4 with a polymerase and a primer; 

(e) incubating one or more polynucleotide of claim 4 with a cloning vector; or 

(f) incubating one or more polynucleotide of claim 4 with a cell. 



10. A composition comprising two or more different polynucleotides of claim 4. 

11. An isolated or recombinant polypeptide comprising a subsequence of at least about 15 
contiguous amino acids encoded by the recombinant or isolated polynucleotide of claim 4. 

12. A plant ectopically expressing an isolated polypeptide of claim 11. 

13. A method for producing a plant having a modified biochemical characteristic, the method 
comprising altering the expression of the isolated or recombinant polynucleotide of claim 4 or the 
expression levels or activity of a polypeptide of claim 11 in a plant, thereby producing a modified 
plant, and selecting the modified plant for a modified biochemical characteristic thereby 
providing the modified plant with a modified biochemical characteristic. 

14. The method of claim 13, wherein the polynucleotide is a polynucleotide of claim 4. 

15. A method of identifying a factor that is modulated by or interacts with a polypeptide 
encoded by a polynucleotide of claim 4, the method comprising: 

(a) expressing a polypeptide encoded by the polynucleotide in a plant; and 

(b) identifying at least one factor that is modulated by or interacts with the polypeptide. 

16. The method of claim 15, wherein the identifying is performed by detecting binding by the 
polypeptide to a promoter sequence, or detecting interactions between an additional protein and 
the polypeptide in a yeast two hybrid system. 

17. The method of claim 15, wherein the identifying is performed by detecting expression of 
a factor by hybridization to a microarray, subtractive hybridization or differential display. 

18. A method of identifying a molecule that modulates activity or expression of a 
polynucleotide or polypeptide of interest, the method comprising: 

(a) placing the molecule in contact with a plant comprising the polynucleotide or 
polypeptide encoded by the polynucleotide of claim 4; and, 

(b) monitoring one or more of: 

(i) expression level of the polynucleotide in the plant; 

(ii) expression level of the polypeptide in the plant; 

(iii) modulation of an activity of the polypeptide in the plant; or 

(iv) modulation of an activity of the polynucleotide in the plant. 

19. An integrated system, computer or computer readable medium comprising one or more 
character strings corresponding to a polynucleotide of claim 4, or to a polypeptide encoded by the 
polynucleotide. 

20. The integrated system, computer or computer readable medium of claim 19, further 
comprising a link between said one or more sequence strings to a modified plant biochemical 
characteristics phenotype. 






21. A method of identifying a sequence similar or homologous to one or more 
polynucleotides of claim 4, or one or more polypeptides encoded by the polynucleotides, the 
method comprising: 

(a) providing a sequence database; and, 

(b) querying the sequence database with one or more target sequences corresponding to 
the one or more polynucleotides or to the one or more polypeptides to identify one or 
more sequence members of the database that display sequence similarity or homology to 
one or more of the one or more target sequences. 

22. The method of claim 21, wherein the querying comprises aligning one or more of the 
target sequences with one or more of the one or more sequence members in the sequence 
database. 

23. The method of claim 21, wherein the querying comprises identifying one or more of the 
one or more sequence members of the database that meet a user-selected identity criteria with one 
or more of the target sequences. 

24. The method of claim 21, further comprising linking the one or more of the 
polynucleotides of claim 4, or encoded polypeptides, to a modified plant biochemical 
characteristics phenotype. 

25. A plant comprising altered expression levels of an isolated or recombinant polynucleotide 
of claim 4. 

26. A plant comprising altered expression levels or the activity of an isolated or recombinant 
polypeptide of claim 11. 

27. A plant lacking a nucleotide sequence encoding a polynucleotide of claim 11. 




Figure 1 

SEQ ID No. | GID | cDNA or protein | conserved domain 
1 | G214 | cDNA | 
2 | G214 | protein | 22-71 
3 | G231 | cDNA | 
4 | G231 | protein | 14-118 
5 | G274 | cDNA | 
6 | G274 | protein | 108-572 
7 | G307 | cDNA | 
8 | G307 | protein | 323-339 
9 | G346 | cDNA | 
10 | G346 | protein | 196-221 
11 | G598 | cDNA | 
12 | G598 | protein | 205-263 
13 | G605 | cDNA | 
14 | G605 | protein | 132-143 
15 | G777 | cDNA | 
16 | G777 | protein | 47-101 
17 | G869 | cDNA | 
18 | G869 | protein | 109-177 
19 | G1133 | cDNA | 
20 | G1133 | protein | 256-326 
21 | G1266 | cDNA | 
22 | G1266 | protein | 79-147 
23 | G1324 | cDNA | 
24 | G1324 | protein | 20-118 
25 | G1337 | cDNA | 
26 | G1337 | protein | 9-75 
27 | G975 | cDNA | 
28 | G975 | protein | 4-71 






Figure 2 



OCA in Mrt 

bcU 1U NO. 




nornoiog 


cuinm or protein 


conserveo uomain 


29 


G680 


homolog of G214 


cDNA 




30 , 


G680 


homolog of G214 


protein 


24-70 


31 


G883 


homotog of G274 


cDNA 




32 


G883 


homolog of G274 


protein 


245-302 


33 


G1855 


homolog of G274 


cDNA 




34 


G1855 


homolog of G274 


protein 


entire protein 


35 


G1190 


homolog of G274 


cDNA 




36 


G1 190 


homolog of Gz74 


protein 


entire proiein 


37 


G308 


homolog of G307 


cDNA 




38 


G308 


homolog of G307 


protein 


270-274 


39 


G1944 


homolog of G605 


cDNA 




40 


G1944 


homolog of G605 


protein 


87-100 


41 


G326 


homolog of G1337 


cDNA 




42 


G326 


homolog of G1337 


protein 


11-94,354-400 


43 


G1387 


homolog of G975 


cDNA 




44 


G1387 


homolog of G975 


protein 


4-71 
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Figure 3A 



SEQ ID No.  GID     Genbank NID  P-value    Species
1           G214    8170933      8.80E-35   Lycopersicon esculentum
1           G214    9205339      1.20E-27   Glycine max
1           G214    8577344      1.80E-23   Zea mays
1           G214    9119112      2.40E-18   Medicago truncatula
1           G214    7660673      4.80E-15   Sorghum bicolor
1           G214    8213273      4.40E-14   Oryza sativa
1           G214    3325786      4.70E-10   Gossypium hirsutum
1           G214    9435251      1.50E-09   Hordeum vulgare
1           G214    9411569      6.80E-09   Triticum aestivum
1           G214    7614730      3.00E-07   Lotus japonicus
3           G231    6651291      7.80E-71   Pimpinella brachycarpa
3           G231    1430845      1.90E-62   Lycopersicon esculentum
3           G231    5268844      1.40E-61   Zea mays
3           G231    7561750      3.90E-60   Medicago truncatula
3           G231    1945282      3.30E-59   Oryza sativa
3           G231    22637        9.80E-49   Physcomitrella patens
3           G231    437326       2.00E-48   Gossypium hirsutum
3           G231    20562        3.40E-48   Petunia x hybrida
3           G231    4886263      5.00E-48   Antirrhinum majus
3           G231    8379692      1.50E-47   Gossypium arboreum
5           G274    6752887      1.70E-231  Malus domestica
5           G274    5734616      1.20E-140  Oryza sativa
5           G274    8996178      5.40E-96   Suaeda maritima subsp. salsa
5           G274    6654657      1.50E-89   Medicago truncatula
5           G274    8105703      2.30E-88   Lycopersicon esculentum
5           G274    7625402      4.00E-87   Gossypium arboreum
5           G274    7588836      2.10E-82   Glycine max
5           G274    5045979      1.30E-76   Gossypium hirsutum
5           G274    7324635      1.90E-71   Lycopersicon pennellii
5           G274    8903627      3.60E-63   Hordeum vulgare
7           G307    5640156      3.80E-151  Triticum aestivum
7           G307    5640154      1.00E-101  Zea mays
7           G307    6970471      1.70E-97   Oryza sativa
7           G307    7718432      4.00E-82   Medicago truncatula
7           G307    8330344      7.90E-78   Mesembryanthemum crystallinum
7           G307    5047560      1.00E-72   Gossypium hirsutum
7           G307    7588689      2.70E-69   Glycine max
7           G307    7623983      2.20E-64   Gossypium arboreum
7           G307    7780253      9.30E-59   Lotus japonicus
7           G307    6733213      1.90E-51   Lycopersicon esculentum
9           G346    4387642      5.90E-28   Lycopersicon esculentum
9           G346    7627902      1.50E-27   Gossypium arboreum
9           G346    8335147      6.40E-27   Oryza sativa
9           G346    8529362      9.10E-27   Medicago truncatula
9           G346    403305       2.30E-26   Nicotiana tabacum
9           G346    9299618      2.50E-26   Sorghum bicolor
9           G346    5056246      7.80E-26   Brassica rapa subsp. pekinensis
9           G346    6827291      6.80E-25   Zea mays
9           G346    6567406      1.90E-24   Glycine max
9           G346    9425896      1.20E-21   Triticum turgidum subsp. durum
11          G598    8102670      1.30E-43   Zea mays
11          G598    4382198      9.80E-42   Lycopersicon esculentum

Figure 3B



SEQ ID No.  GID     Genbank NID  P-value    Species
11          G598    7553316      8.00E-38   Sorghum bicolor
11          G598    9445834      3.10E-36   Triticum aestivum
11          G598    7332502      8.80E-30   Oryza sativa
11          G598    9056816      1.70E-17   Medicago truncatula
11          G598    6644720      5.20E-15   Mesembryanthemum crystallinum
11          G598    3853398      2.20E-14   Populus tremula x Populus tremuloides
11          G598    9419408      6.80E-09   Hordeum vulgare
11          G598    6848223      1.40E-0?   Glycine max
13          G605    7624850      4.40E-49   Gossypium arboreum
13          G605    9204125      6.50E-46   Glycine max
13          G605    2213533      5.50E-33   Pisum sativum
13          G605    7009437      1.40E-28   Zea mays
13          G605    8104258      3.50E-28   Lycopersicon esculentum
13          G605    7536402      4.10E-28   Sorghum bicolor
13          G605    3107210      1.60E-22   Oryza sativa
13          G605    7784135      9.20E-20   Lotus japonicus
13          G605    4165182      8.30E-18   Antirrhinum majus
13          G605    6555294      8.10E-17   Pinus taeda
15          G777    8172576      3.10E-29   Medicago truncatula
15          G777    8331320      4.60E-17   Mesembryanthemum crystallinum
15          G777    8106138      3.00E-16   Lycopersicon esculentum
15          G777    5046832      1.20E-14   Gossypium hirsutum
15          G777    6918785      1.70E-13   Zea mays
15          G777    5666914      1.30E-07   Glycine max
15          G777    8856987      0.98       Oryza sativa
15          G777    8404755      1          Hordeum vulgare
17          G869    2213784      1.30E-19   Lycopersicon esculentum
17          G869    3065894      7.30E-19   Nicotiana tabacum
17          G869    8570080      4.20E-18   Oryza sativa
17          G869    7560260      1.50E-17   Medicago truncatula
17          G869    7534890      5.20E-14   Sorghum bicolor
17          G869    6455322      1.10E-13   Glycine max
17          G869    9362061      2.70E-13   Triticum aestivum
17          G869    7788764      5.70E-13   Lotus japonicus
17          G869    7624302      2.50E-12   Gossypium arboreum
17          G869    3858036      2.80E-12   Populus balsamifera subsp. trichocarpa
19          G1133   8070726      1.30E-16   Solanum tuberosum
19          G1133   6848196      1.60E-16   Glycine max
19          G1133   7570922      3.60E-13   Medicago truncatula
19          G1133   9434859      1.90E-12   Lycopersicon esculentum
19          G1133   5704484      0.005      Oryza sativa
19          G1133   902661       0.0081     Hordeum vulgare
19          G1133   8666194      0.0086     Pinus taeda
19          G1133   5725018      0.14       Brassica rapa subsp. pekinensis
19          G1133   [illegible]  0.?4       Gossypium arboreum
19          G1133   7747388      0.98       Lotus japonicus
21          G1266   1732405      1.50E-50   Nicotiana tabacum
21          G1266   7145976      2.50E-38   Glycine max
21          G1266   3326366      1.00E-37   Gossypium hirsutum
21          G1266   5762854      6.90E-37   Lotus japonicus
21          G1266   7560749      9.10E-34   Medicago truncatula
21          G1266   7934594      6.60E-33   Euphorbia esula
21          G1266   9431305      2.10E-28   Lycopersicon esculentum

Figure 3C



SEQ ID No.  GID     Genbank NID  P-value    Species
21          G1266   7528275      5.40E-21   Mesembryanthemum crystallinum
21          G1266   6478844      4.10E-20   Matricaria chamomilla
21          G1266   7627061      4.20E-20   Gossypium arboreum
23          G1324   2921337      2.30E-54   Gossypium hirsutum
23          G1324   5891412      3.50E-52   Lycopersicon esculentum
23          G1324   8528843      7.20E-50   Medicago truncatula
23          G1324   1002797      5.40E-49   Craterostigma plantagineum
23          G1324   5666961      3.90E-44   Glycine max
23          G1324   7244640      1.70E-42   Mentha x piperita
23          G1324   1841474      3.00E-42   Pisum sativum
23          G1324   4979554      1.30E-39   Oryza sativa
23          G1324   9363368      3.00E-32   Triticum aestivum
23          G1324   9296080      3.50E-32   Sorghum bicolor
25          G1337   7410432      2.60E-41   Lycopersicon esculentum
25          G1337   3618319      1.10E-32   Oryza sativa
25          G1337   7571599      1.00E-28   Medicago truncatula
25          G1337   7685955      5.10E-27   Glycine max
25          G1337   7323708      2.60E-25   Lycopersicon hirsutum
25          G1337   4091805      1.00E-18   Malus domestica
25          G1337   6917805      4.80E-18   Lycopersicon pennellii
25          G1337   3341722      1.60E-17   Raphanus sativus
25          G1337   2303680      4.50E-17   Brassica napus
25          G1337   4557092      9.10E-17   Pinus radiata
27          G975    8103850      8.50E-46   Lycopersicon esculentum
27          G975    7590215      1.50E-45   Glycine max
27          G975    5056299      2.20E-34   Brassica rapa subsp. pekinensis
27          G975    9278522      1.80E-26   Lotus japonicus
27          G975    1128767      2.70E-18   Brassica rapa
27          G975    5859978      5.50E-18   Pinus taeda
27          G975    9427282      2.40E-15   Triticum aestivum
27          G975    19506        4.70E-14   Lupinus polyphyllus
27          G975    6799584      5.30E-14   Medicago truncatula
27          G975    7324705      1.70E-12   Lycopersicon pennellii
29          G680    9258166      5.70E-36   Glycine max
29          G680    9255178      3.00E-29   Zea mays
29          G680    5274804      1.20E-27   Lycopersicon esculentum
29          G680    4974199      3.00E-22   Oryza sativa
29          G680    3325786      2.10E-21   Gossypium hirsutum
29          G680    9119112      1.30E-18   Medicago truncatula
29          G680    7660673      3.20E-17   Sorghum bicolor
29          G680    7243970      6.10E-16   Mentha x piperita
29          G680    3858093      2.10E-10   Populus balsamifera subsp. trichocarpa
29          G680    8845091      3.70E-10   Triticum aestivum
31          G883    4760595      2.40E-84   Nicotiana tabacum
31          G883    4894962      3.50E-45   Avena sativa
31          G883    6719425      1.70E-36   Glycine max
31          G883    5273248      2.80E-35   Lycopersicon esculentum
31          G883    9302479      3.00E-34   Sorghum bicolor
31          G883    6799932      1.40E-31   Medicago truncatula
31          G883    5456433      4.30E-31   Zea mays
31          G883    8706346      1.40E-30   Hordeum vulgare
31          G883    8404566      2.70E-30   Oryza sativa
31          G883    1432055      2.00E-27   Petroselinum crispum

Figure 3D



SEQ ID No.  GID     Genbank NID  P-value    Species
33          G1855   6752887      4.80E-181  Malus domestica
33          G1855   5734616      7.60E-154  Oryza sativa
33          G1855   4384552      3.80E-80   Lycopersicon esculentum
33          G1855   8996178      1.80E-78   Suaeda maritima subsp. salsa
33          G1855   7625402      1.60E-77   Gossypium arboreum
33          G1855   8903627      3.80E-74   Hordeum vulgare
33          G1855   6654657      2.20E-70   Medicago truncatula
33          G1855   8090141      4.50E-64   Sorghum bicolor
33          G1855   9028645      6.30E-64   Zea mays
33          G1855   7588836      6.70E-62   Glycine max
35          G1190   6752887      7.00E-111  Malus domestica
35          G1190   5734616      1.20E-98   Oryza sativa
35          G1190   7569650      2.60E-92   Medicago truncatula
35          G1190   4380101      5.50E-88   Lycopersicon esculentum
35          G1190   6567183      5.20E-81   Glycine max
35          G1190   8901706      7.40E-80   Hordeum vulgare
35          G1190   8070121      1.70E-76   Solanum tuberosum
35          G1190   8666639      5.50E-75   Pinus taeda
35          G1190   8088688      3.40E-72   Sorghum bicolor
35          G1190   6020980      6.50E-67   Zea mays
37          G308    5640156      3.50E-162  Triticum aestivum
37          G308    5640154      2.30E-134  Zea mays
37          G308    6970471      4.20E-120  Oryza sativa
37          G308    7718432      8.70E-80   Medicago truncatula
37          G308    8330344      3.90E-76   Mesembryanthemum crystallinum
37          G308    5047560      1.50E-71   Gossypium hirsutum
37          G308    7588689      1.90E-68   Glycine max
37          G308    7623983      2.90E-62   Gossypium arboreum
37          G308    7780253      1.10E-57   Lotus japonicus
37          G308    6733213      3.70E-48   Lycopersicon esculentum
39          G1944   9204125      5.50E-52   Glycine max
39          G1944   7624850      6.60E-45   Gossypium arboreum
39          G1944   7784135      7.20E-32   Lotus japonicus
39          G1944   9280727      2.60E-29   Oryza sativa
39          G1944   7009437      1.30E-28   Zea mays
39          G1944   7536402      1.30E-28   Sorghum bicolor
39          G1944   8104258      6.50E-27   Lycopersicon esculentum
39          G1944   2213533      3.50E-23   Pisum sativum
39          G1944   4165182      7.10E-17   Antirrhinum majus
39          G1944   6555294      2.90E-16   Pinus taeda
41          G326    7410432      1.10E-37   Lycopersicon esculentum
41          G326    3618319      2.90E-32   Oryza sativa
41          G326    7571599      4.90E-30   Medicago truncatula
41          G326    7232283      6.30E-28   Glycine max
41          G326    7323708      ?.??E-27   Lycopersicon hirsutum
41          G326    4091805      2.30E-19   Malus domestica
41          G326    6917805      6.50E-19   Lycopersicon pennellii
41          G326    3341722      2.50E-18   Raphanus sativus
41          G326    4557092      7.50E-18   Pinus radiata
41          G326    2303680      4.70E-17   Brassica napus
43          G1387   8285738      1.40E-46   Glycine max
43          G1387   8103850      5.20E-46   Lycopersicon esculentum
43          G1387   5056299      1.10E-20   Brassica rapa subsp. pekinensis

Figure 3E



SEQ ID No.  GID     Genbank NID  P-value    Species
43          G1387   9278522      1.50E-18   Lotus japonicus
43          G1387   5859978      2.00E-15   Pinus taeda
43          G1387   7766740      4.70E-14   Medicago truncatula
43          G1387   9427282      1.40E-12   Triticum aestivum
43          G1387   3857766      3.40E-12   Populus balsamifera subsp. trichocarpa
43          G1387   19506        4.60E-12   Lupinus polyphyllus
43          G1387   7273843      2.20E-11   Oryza sativa

SEQUENCE LISTING

<110> Creelman, Robert 
Yu, Guo-Liang
Adam, Luc 

Riechmann, Jose Luis 
Heard, Jacqueline 
Samaha, Raymond 
Pilgrim, Marsha 
Pineda, Omaira
Jiang, Cai-Zhong 

<120> Plant Biochemistry-Related Genes 

<130> MBI-0020 

<150> 60/164,132 
<151> 1999-11-17 

<150> 60/197,899 

<151> 2000-04-17 

<150> Plant Trait Modification III 

<151> 2000-08-22 

<160> 44 

<170> PatentIn version 3.0

<210> 1 

<211> 2240 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (238)..(2064)

<223> G214 

<400> 1 

tgagatttct ccatttccgt agcttctggt ctcttttctt tgtttcattg atcaaaagca 60 

aatcacttct tcttcttctt cttctcgatt tcttactgtt ttcttatcca acgaaatctg 120 

gaattaaaaa tggaatcttt atcgaatcca agctgatttt gtttctttca ttgaatcatc 180 

tctctaaagt ggaattttgt aaagagaaga tctgaagttg tgtagaggag cttagtg 237 

atg gag aca aat tcg tct gga gaa gat ctg gtt att aag act cgg aag 285
Met Glu Thr Asn Ser Ser Gly Glu Asp Leu Val Ile Lys Thr Arg Lys
1 5 10 15

cca tat acg ata aca aag caa cgt gaa agg tgg act gag gaa gaa cat 333
Pro Tyr Thr Ile Thr Lys Gln Arg Glu Arg Trp Thr Glu Glu Glu His
20 25 30

aat aga ttc att gaa gct ttg agg ctt tat ggt aga gca tgg cag aag 381
Asn Arg Phe Ile Glu Ala Leu Arg Leu Tyr Gly Arg Ala Trp Gln Lys
35 40 45

att gaa gaa cat gta gca aca aaa act gct gtc cag ata aga agt cac 429
Ile Glu Glu His Val Ala Thr Lys Thr Ala Val Gln Ile Arg Ser His
50 55 60

gct cag aaa ttt ttc tcc aag gta gag aaa gag gct gaa gct aaa ggt 477
Ala Gln Lys Phe Phe Ser Lys Val Glu Lys Glu Ala Glu Ala Lys Gly
65 70 75 80

gta gct atg ggt caa gcg cta gac ata gct att cct cct cca cgg cct 525
Val Ala Met Gly Gln Ala Leu Asp Ile Ala Ile Pro Pro Pro Arg Pro
85 90 95

aag cgt aaa cca aac aat cct tat cct cga aag acg gga agt gga acg 573
Lys Arg Lys Pro Asn Asn Pro Tyr Pro Arg Lys Thr Gly Ser Gly Thr
100 105 110

MBI-20 Sequence Listing. ST25

atc ctt atg tca aaa acg ggt gtg aat gat gga aaa gag tcc ctt gga 621
Ile Leu Met Ser Lys Thr Gly Val Asn Asp Gly Lys Glu Ser Leu Gly
115 120 125

tca gaa aaa gtg tcg cat cct gag atg gcc aat gaa gat cga caa caa 669
Ser Glu Lys Val Ser His Pro Glu Met Ala Asn Glu Asp Arg Gln Gln
130 135 140

tca aag cct gaa gag aaa act ctg cag gaa gac aac tgt tca gat tgt 717
Ser Lys Pro Glu Glu Lys Thr Leu Gln Glu Asp Asn Cys Ser Asp Cys
145 150 155 160

ttc act cat cag tat ctc tct gct gca tcc tcc atg aat aaa agt tgt 765
Phe Thr His Gln Tyr Leu Ser Ala Ala Ser Ser Met Asn Lys Ser Cys
165 170 175

ata gag aca tca aac gca agc act ttc cgc gag ttc ttg cct tca cgg 813
Ile Glu Thr Ser Asn Ala Ser Thr Phe Arg Glu Phe Leu Pro Ser Arg
180 185 190

gaa gag gga agt cag aat aac agg gta aga aag gag tca aac tca gat 861
Glu Glu Gly Ser Gln Asn Asn Arg Val Arg Lys Glu Ser Asn Ser Asp
195 200 205

ttg aat gca aaa tct ctg gaa aac ggt aat gag caa gga cct cag act 909
Leu Asn Ala Lys Ser Leu Glu Asn Gly Asn Glu Gln Gly Pro Gln Thr
210 215 220

tat ccg atg cat atc cct gtg cta gtg cca ttg ggg agc tca ata aca 957
Tyr Pro Met His Ile Pro Val Leu Val Pro Leu Gly Ser Ser Ile Thr
225 230 235 240

agt tct cta tca cat cct cct tca gag cca gat agt cat ccc cac aca 1005
Ser Ser Leu Ser His Pro Pro Ser Glu Pro Asp Ser His Pro His Thr
245 250 255

gtt gca gga gat tat cag tcg ttt cct aat cat ata atg tca acc ctt 1053
Val Ala Gly Asp Tyr Gln Ser Phe Pro Asn His Ile Met Ser Thr Leu
260 265 270

tta caa aca ccg gct ctt tat act gcc gca act ttc gcc tca tca ttt 1101
Leu Gln Thr Pro Ala Leu Tyr Thr Ala Ala Thr Phe Ala Ser Ser Phe
275 280 285

tgg cct ccc gat tct agt ggt ggc tca cct gtt cca ggg aac tca cct 1149
Trp Pro Pro Asp Ser Ser Gly Gly Ser Pro Val Pro Gly Asn Ser Pro
290 295 300

ccg aat ctg gct gcc atg gcc gca gcc act gtt gca gct gct agt gct 1197
Pro Asn Leu Ala Ala Met Ala Ala Ala Thr Val Ala Ala Ala Ser Ala
305 310 315 320

tgg tgg gct gcc aat gga tta tta cct tta tgt gct cct ctt agt tca 1245
Trp Trp Ala Ala Asn Gly Leu Leu Pro Leu Cys Ala Pro Leu Ser Ser
325 330 335

ggt ggt ttc act agt cat cct cca tct act ttt gga cca tca tgt gat 1293
Gly Gly Phe Thr Ser His Pro Pro Ser Thr Phe Gly Pro Ser Cys Asp
340 345 350

gta gag tac aca aaa gca agc act tta caa cat ggt tct gtg cag agc 1341
Val Glu Tyr Thr Lys Ala Ser Thr Leu Gln His Gly Ser Val Gln Ser
355 360 365

cga gag caa gaa cac tcc gag gca tca aag gct cga tct tca ctg gac 1389
Arg Glu Gln Glu His Ser Glu Ala Ser Lys Ala Arg Ser Ser Leu Asp
370 375 380

tca gag gat gtt gaa aat aag agt aaa cca gtt tgt cat gag cag cct 1437
Ser Glu Asp Val Glu Asn Lys Ser Lys Pro Val Cys His Glu Gln Pro
385 390 395 400

tct gca aca cct gag agt gat gca aag ggt tca gat gga gca gga gac 1485
Ser Ala Thr Pro Glu Ser Asp Ala Lys Gly Ser Asp Gly Ala Gly Asp
405 410 415

aga aaa caa gtt gac cgg tcc ccg tgt ggc tca aac act ccg tcg agt 1533
Arg Lys Gln Val Asp Arg Ser Ser Cys Gly Ser Asn Thr Pro Ser Ser
420 425 430

agt gat gat gtt gag gcg gat gca tca gaa agg caa gag gat ggc acc 1581
Ser Asp Asp Val Glu Ala Asp Ala Ser Glu Arg Gln Glu Asp Gly Thr
435 440 445

aat ggt gag gtg aaa gaa acg aat gaa gac act aat aaa cct caa act 1629
Asn Gly Glu Val Lys Glu Thr Asn Glu Asp Thr Asn Lys Pro Gln Thr
450 455 460

tca gag tcc aat gca cgc cgc agt aga atc agc tcc aat ata acc gat 1677
Ser Glu Ser Asn Ala Arg Arg Ser Arg Ile Ser Ser Asn Ile Thr Asp
465 470 475 480

cca tgg aag tct gtg tct gac gag ggt cga att gcc ttc caa gct ctc 1725
Pro Trp Lys Ser Val Ser Asp Glu Gly Arg Ile Ala Phe Gln Ala Leu
485 490 495

ttc tcc aga gag gta ttg ccg caa agt ttt aca tat cga gaa gaa cac 1773
Phe Ser Arg Glu Val Leu Pro Gln Ser Phe Thr Tyr Arg Glu Glu His
500 505 510

aga gag gaa gaa caa caa caa caa gaa caa aga tat cca atg gca ctt 1821
Arg Glu Glu Glu Gln Gln Gln Gln Glu Gln Arg Tyr Pro Met Ala Leu
515 520 525

gat ctt aac ttc aca gct cag tta aca cca gtt gat gat caa gag gag 1869
Asp Leu Asn Phe Thr Ala Gln Leu Thr Pro Val Asp Asp Gln Glu Glu
530 535 540

aag aga aac aca gga ttt ctt gga atc gga tta gat gct tca aag cta 1917
Lys Arg Asn Thr Gly Phe Leu Gly Ile Gly Leu Asp Ala Ser Lys Leu
545 550 555 560

atg agt aga gga aga aca ggt ttt aaa cca tac aaa aga tgt tcc atg 1965
Met Ser Arg Gly Arg Thr Gly Phe Lys Pro Tyr Lys Arg Cys Ser Met
565 570 575

gaa gcc aaa gaa agt aga atc ctc aac aac aat cct atc att cat gtg 2013
Glu Ala Lys Glu Ser Arg Ile Leu Asn Asn Asn Pro Ile Ile His Val
580 585 590

gaa cag aaa gat ccc aaa cgg atg cgg ttg gaa act caa gct tcc aca 2061
Glu Gln Lys Asp Pro Lys Arg Met Arg Leu Glu Thr Gln Ala Ser Thr
595 600 605

tga gactctattt tcatctgatc tgttgtttgt actctgtttt taagttttca 2114

agaccactgc tacattttct ttttcttttg aggcctttgt atttgtttcc ttgtccatag 2174

tcttcctgta acatttgact ctgtattatt caacaaatca taaactgttt aatctttttt 2234 

tttcca 2240 

<210> 2 
<211> 608 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 2 

Met Glu Thr Asn Ser Ser Gly Glu Asp Leu Val Ile Lys Thr Arg Lys
1 5 10 15

Pro Tyr Thr Ile Thr Lys Gln Arg Glu Arg Trp Thr Glu Glu Glu His
20 25 30

Asn Arg Phe Ile Glu Ala Leu Arg Leu Tyr Gly Arg Ala Trp Gln Lys
35 40 45

Ile Glu Glu His Val Ala Thr Lys Thr Ala Val Gln Ile Arg Ser His
50 55 60

Ala Gln Lys Phe Phe Ser Lys Val Glu Lys Glu Ala Glu Ala Lys Gly
65 70 75 80

Val Ala Met Gly Gln Ala Leu Asp Ile Ala Ile Pro Pro Pro Arg Pro
85 90 95

Lys Arg Lys Pro Asn Asn Pro Tyr Pro Arg Lys Thr Gly Ser Gly Thr
100 105 110

Ile Leu Met Ser Lys Thr Gly Val Asn Asp Gly Lys Glu Ser Leu Gly
115 120 125

Ser Glu Lys Val Ser His Pro Glu Met Ala Asn Glu Asp Arg Gln Gln
130 135 140

Ser Lys Pro Glu Glu Lys Thr Leu Gln Glu Asp Asn Cys Ser Asp Cys
145 150 155 160

Phe Thr His Gln Tyr Leu Ser Ala Ala Ser Ser Met Asn Lys Ser Cys
165 170 175

Ile Glu Thr Ser Asn Ala Ser Thr Phe Arg Glu Phe Leu Pro Ser Arg
180 185 190

Glu Glu Gly Ser Gln Asn Asn Arg Val Arg Lys Glu Ser Asn Ser Asp
195 200 205

Leu Asn Ala Lys Ser Leu Glu Asn Gly Asn Glu Gln Gly Pro Gln Thr
210 215 220

Tyr Pro Met His Ile Pro Val Leu Val Pro Leu Gly Ser Ser Ile Thr
225 230 235 240

Ser Ser Leu Ser His Pro Pro Ser Glu Pro Asp Ser His Pro His Thr
245 250 255

Val Ala Gly Asp Tyr Gln Ser Phe Pro Asn His Ile Met Ser Thr Leu
260 265 270

Leu Gln Thr Pro Ala Leu Tyr Thr Ala Ala Thr Phe Ala Ser Ser Phe
275 280 285

Trp Pro Pro Asp Ser Ser Gly Gly Ser Pro Val Pro Gly Asn Ser Pro
290 295 300

Pro Asn Leu Ala Ala Met Ala Ala Ala Thr Val Ala Ala Ala Ser Ala
305 310 315 320

Trp Trp Ala Ala Asn Gly Leu Leu Pro Leu Cys Ala Pro Leu Ser Ser
325 330 335

Gly Gly Phe Thr Ser His Pro Pro Ser Thr Phe Gly Pro Ser Cys Asp
340 345 350

Val Glu Tyr Thr Lys Ala Ser Thr Leu Gln His Gly Ser Val Gln Ser
355 360 365

Arg Glu Gln Glu His Ser Glu Ala Ser Lys Ala Arg Ser Ser Leu Asp
370 375 380

Ser Glu Asp Val Glu Asn Lys Ser Lys Pro Val Cys His Glu Gln Pro
385 390 395 400

Ser Ala Thr Pro Glu Ser Asp Ala Lys Gly Ser Asp Gly Ala Gly Asp
405 410 415

Arg Lys Gln Val Asp Arg Ser Ser Cys Gly Ser Asn Thr Pro Ser Ser
420 425 430

Ser Asp Asp Val Glu Ala Asp Ala Ser Glu Arg Gln Glu Asp Gly Thr
435 440 445

Asn Gly Glu Val Lys Glu Thr Asn Glu Asp Thr Asn Lys Pro Gln Thr
450 455 460

Ser Glu Ser Asn Ala Arg Arg Ser Arg Ile Ser Ser Asn Ile Thr Asp
465 470 475 480

Pro Trp Lys Ser Val Ser Asp Glu Gly Arg Ile Ala Phe Gln Ala Leu
485 490 495

Phe Ser Arg Glu Val Leu Pro Gln Ser Phe Thr Tyr Arg Glu Glu His
500 505 510

Arg Glu Glu Glu Gln Gln Gln Gln Glu Gln Arg Tyr Pro Met Ala Leu
515 520 525

Asp Leu Asn Phe Thr Ala Gln Leu Thr Pro Val Asp Asp Gln Glu Glu
530 535 540

Lys Arg Asn Thr Gly Phe Leu Gly Ile Gly Leu Asp Ala Ser Lys Leu
545 550 555 560

Met Ser Arg Gly Arg Thr Gly Phe Lys Pro Tyr Lys Arg Cys Ser Met
565 570 575

Glu Ala Lys Glu Ser Arg Ile Leu Asn Asn Asn Pro Ile Ile His Val
580 585 590

Glu Gln Lys Asp Pro Lys Arg Met Arg Leu Glu Thr Gln Ala Ser Thr
595 600 605

<210> 3 
<211> 916 
<212> DNA 

<213> Arabidopsis thaliana 
<220> 




<221> CDS 

<222> (88)..(888)

<223> G231 

<400> 3 

ttccatatct cttccatttc gctctctatt tcacatcccc atataacata atatacaatc 60 

acacatatca tttctatata gtattta atg ggg aga cag cca tgc tgt gac aag 114 

Met Gly Arg Gln Pro Cys Cys Asp Lys
1 5

cta ggg gtg aag aaa ggg ccg tgg acg gtg gag gaa gat aag aag ctt 162
Leu Gly Val Lys Lys Gly Pro Trp Thr Val Glu Glu Asp Lys Lys Leu
10 15 20 25

ata aac ttc ata cta acc aat ggc cat tgt tgc tgg cgt gct ttg ccg 210
Ile Asn Phe Ile Leu Thr Asn Gly His Cys Cys Trp Arg Ala Leu Pro
30 35 40

aag ctg gcc ggt ctc cgt cgc tgt gga aag agc tgc cgc ctc cgg tgg 258
Lys Leu Ala Gly Leu Arg Arg Cys Gly Lys Ser Cys Arg Leu Arg Trp
45 50 55

act aac tat ctc cgg cct ggc tta aaa cga ggc ctt ctc tcg cat gat 306
Thr Asn Tyr Leu Arg Pro Gly Leu Lys Arg Gly Leu Leu Ser His Asp
60 65 70

gaa gaa caa ctt gtc ata gat ctt cat gct aat ctc ggc aat aag tgg 354
Glu Glu Gln Leu Val Ile Asp Leu His Ala Asn Leu Gly Asn Lys Trp
75 80 85

tct aag ata gct tca aga tta cct gga aga aca gat aac gaa ata aaa 402
Ser Lys Ile Ala Ser Arg Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys
90 95 100 105

aac cat tgg aat act cat atc aag aag aaa ctt ctt aag atg gga atc 450
Asn His Trp Asn Thr His Ile Lys Lys Lys Leu Leu Lys Met Gly Ile
110 115 120

gat cct atg acc cat caa ccc cta aat caa gaa cct tct aat atc gat 498
Asp Pro Met Thr His Gln Pro Leu Asn Gln Glu Pro Ser Asn Ile Asp
125 130 135

aat tcc aaa acc att ccg tcc aat cca gac gat gtc tca gtg gaa cca 546
Asn Ser Lys Thr Ile Pro Ser Asn Pro Asp Asp Val Ser Val Glu Pro
140 145 150

aag aca act aac acg aaa tac gtg gag ata agt gtc acg aca aca gaa 594
Lys Thr Thr Asn Thr Lys Tyr Val Glu Ile Ser Val Thr Thr Thr Glu
155 160 165

gaa gaa agt agt agc acg gtt act gat caa aac agt tcg atg gat aat 642
Glu Glu Ser Ser Ser Thr Val Thr Asp Gln Asn Ser Ser Met Asp Asn
170 175 180 185

gaa aat cat cta att gac aac att tat gat gat gat gaa ttg ttt agt 690
Glu Asn His Leu Ile Asp Asn Ile Tyr Asp Asp Asp Glu Leu Phe Ser
190 195 200

tac tta tgg tcc gac gaa act act aaa gat gag gcc tct tgg agt gat 738
Tyr Leu Trp Ser Asp Glu Thr Thr Lys Asp Glu Ala Ser Trp Ser Asp
205 210 215

agt aac ttt ggt gtt ggt gga aca tta tat gac cac aat atc tcc ggc 786
Ser Asn Phe Gly Val Gly Gly Thr Leu Tyr Asp His Asn Ile Ser Gly
220 225 230

gcc gat gca gat ttt ccg ata tgg tca ccg gaa aga atc aat gac gag 834
Ala Asp Ala Asp Phe Pro Ile Trp Ser Pro Glu Arg Ile Asn Asp Glu
235 240 245

aag atg ttt ttg gat tat tgt caa gac ttt ggt gtt cat gat ttt ggg 882
Lys Met Phe Leu Asp Tyr Cys Gln Asp Phe Gly Val His Asp Phe Gly
250 255 260 265

ttt tga ctgttcacca ttgacatatt ggcaacgc 916
Phe



<210> 4 
<211> 266 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 4 

Met Gly Arg Gln Pro Cys Cys Asp Lys Leu Gly Val Lys Lys Gly Pro
1 5 10 15

Trp Thr Val Glu Glu Asp Lys Lys Leu Ile Asn Phe Ile Leu Thr Asn
20 25 30

Gly His Cys Cys Trp Arg Ala Leu Pro Lys Leu Ala Gly Leu Arg Arg
35 40 45

Cys Gly Lys Ser Cys Arg Leu Arg Trp Thr Asn Tyr Leu Arg Pro Gly
50 55 60

Leu Lys Arg Gly Leu Leu Ser His Asp Glu Glu Gln Leu Val Ile Asp
65 70 75 80

Leu His Ala Asn Leu Gly Asn Lys Trp Ser Lys Ile Ala Ser Arg Leu
85 90 95

Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn His Trp Asn Thr His Ile
100 105 110

Lys Lys Lys Leu Leu Lys Met Gly Ile Asp Pro Met Thr His Gln Pro
115 120 125

Leu Asn Gln Glu Pro Ser Asn Ile Asp Asn Ser Lys Thr Ile Pro Ser
130 135 140

Asn Pro Asp Asp Val Ser Val Glu Pro Lys Thr Thr Asn Thr Lys Tyr
145 150 155 160

Val Glu Ile Ser Val Thr Thr Thr Glu Glu Glu Ser Ser Ser Thr Val
165 170 175

Thr Asp Gln Asn Ser Ser Met Asp Asn Glu Asn His Leu Ile Asp Asn
180 185 190

Ile Tyr Asp Asp Asp Glu Leu Phe Ser Tyr Leu Trp Ser Asp Glu Thr
195 200 205

Thr Lys Asp Glu Ala Ser Trp Ser Asp Ser Asn Phe Gly Val Gly Gly
210 215 220

Thr Leu Tyr Asp His Asn Ile Ser Gly Ala Asp Ala Asp Phe Pro Ile
225 230 235 240

Trp Ser Pro Glu Arg Ile Asn Asp Glu Lys Met Phe Leu Asp Tyr Cys
245 250 255

Gln Asp Phe Gly Val His Asp Phe Gly Phe
260 265

<210> 5 
<211> 2371 
<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (172)..(2037)

<400> 5 

gacattattt taagtgtgtt ctctctctgt cacactcaca aagctttata ctttctggct 60 

actgcaagct catcagtgaa aagagcttaa accagagaga tctgataaga gaaattttag 120 

agtctctctg cttcaacaag atctacatcg accaggagat tagaaagaat c atg ggt 177 

Met Gly 
1 

tct aag cat aac cca cca ggg aat aac aga tcg aga agt aca cta tct 225
Ser Lys His Asn Pro Pro Gly Asn Asn Arg Ser Arg Ser Thr Leu Ser
5 10 15

cta ctc gtt gtg gtt ggt tta tgt tgt ttc ttc tat ctt ctt gga gca 273
Leu Leu Val Val Val Gly Leu Cys Cys Phe Phe Tyr Leu Leu Gly Ala
20 25 30

tgg caa aag agt ggg ttt ggt aaa gga gat agc ata gct atg gag att 321
Trp Gln Lys Ser Gly Phe Gly Lys Gly Asp Ser Ile Ala Met Glu Ile
35 40 45 50

aca aag caa gcg cag tgt act gac att gtc act gat ctt gat ttt gaa 369
Thr Lys Gln Ala Gln Cys Thr Asp Ile Val Thr Asp Leu Asp Phe Glu
55 60 65

cct cat cac aac aca gtg aag atc cca cat aaa gct gat ccc aaa cct 417
Pro His His Asn Thr Val Lys Ile Pro His Lys Ala Asp Pro Lys Pro
70 75 80

gtt tct ttc aaa ccg tgt gat gtg aag ctc aag gat tac acg cct tgt 465
Val Ser Phe Lys Pro Cys Asp Val Lys Leu Lys Asp Tyr Thr Pro Cys
85 90 95

caa gag caa gac cga gct atg aag ttc ccg aga gag aac atg att tac 513
Gln Glu Gln Asp Arg Ala Met Lys Phe Pro Arg Glu Asn Met Ile Tyr
100 105 110

aga gag aga cat tgt cct cct gat aat gag aag ctg cgt tgt ctt gtt 561
Arg Glu Arg His Cys Pro Pro Asp Asn Glu Lys Leu Arg Cys Leu Val
115 120 125 130

cca gct cct aaa ggg tat atg act cct ttc cct tgg cct aaa agc aga 609
Pro Ala Pro Lys Gly Tyr Met Thr Pro Phe Pro Trp Pro Lys Ser Arg
135 140 145

gat tat gtt cac tat gct aat gct cct ttc aag agc ttg act gtc gaa 657
Asp Tyr Val His Tyr Ala Asn Ala Pro Phe Lys Ser Leu Thr Val Glu
150 155 160

aaa gct gga cag aat tgg gtt cag ttt caa ggg aat gtg ttt aaa ttc 705
Lys Ala Gly Gln Asn Trp Val Gln Phe Gln Gly Asn Val Phe Lys Phe
165 170 175

cct ggt gga gga act atg ttt cct caa ggt gct gat gcg tat att gaa 753
Pro Gly Gly Gly Thr Met Phe Pro Gln Gly Ala Asp Ala Tyr Ile Glu
180 185 190

gag cta gct tct gtt atc cct atc aaa gat ggc tct gtt aga acc gca 801
Glu Leu Ala Ser Val Ile Pro Ile Lys Asp Gly Ser Val Arg Thr Ala
195 200 205 210

ttg gac act gga tgt ggg gtt gct agt tgg ggt gct tat atg ctt aag 849
Leu Asp Thr Gly Cys Gly Val Ala Ser Trp Gly Ala Tyr Met Leu Lys
215 220 225

agg aat gtt ttg act atg tcg ttt gcg cca agg gat aac cac gaa gca 897
Arg Asn Val Leu Thr Met Ser Phe Ala Pro Arg Asp Asn His Glu Ala
230 235 240

caa gtc cag ttt gcg ctt gag aga ggt gtt cca gcg att atc gct gtt 945
Gln Val Gln Phe Ala Leu Glu Arg Gly Val Pro Ala Ile Ile Ala Val
245 250 255

ctt gga tca atc ctt ctt cct tac cct gca aga gcc ttt gac atg gct 993
Leu Gly Ser Ile Leu Leu Pro Tyr Pro Ala Arg Ala Phe Asp Met Ala
260 265 270

caa tgc tct cga tgc ttg ata cca tgg acc gca aac gag gga aca tac 1041
Gln Cys Ser Arg Cys Leu Ile Pro Trp Thr Ala Asn Glu Gly Thr Tyr
275 280 285 290

tta atg gaa gta gat aga gtc ttg aga cct gga ggt tac tgg gtc tta 1089
Leu Met Glu Val Asp Arg Val Leu Arg Pro Gly Gly Tyr Trp Val Leu
295 300 305

tcg ggt cct cca atc aac tgg aag aca tgg cac aag acg tgg aac cga 1137
Ser Gly Pro Pro Ile Asn Trp Lys Thr Trp His Lys Thr Trp Asn Arg
310 315 320

act aaa gca gag cta aat gcc gag caa aag aga ata gag gga atc gca 1185
Thr Lys Ala Glu Leu Asn Ala Glu Gln Lys Arg Ile Glu Gly Ile Ala
325 330 335

gag tcc tta tgc tgg gag aag aag tat gag aag gga gac att gca att 1233
Glu Ser Leu Cys Trp Glu Lys Lys Tyr Glu Lys Gly Asp Ile Ala Ile
340 345 350

ttc aga aag aaa ata aac gat aga tca tgc gat aga tca aca ccg gtt 1281
Phe Arg Lys Lys Ile Asn Asp Arg Ser Cys Asp Arg Ser Thr Pro Val
355 360 365 370

gac acc tgc aaa aga aag gac act gac gat gtc tgg tac aag gag ata 1329
Asp Thr Cys Lys Arg Lys Asp Thr Asp Asp Val Trp Tyr Lys Glu Ile
375 380 385

gaa acg tgt gta aca cca ttc cct aaa gta tca aac gaa gaa gaa gtt 1377
Glu Thr Cys Val Thr Pro Phe Pro Lys Val Ser Asn Glu Glu Glu Val
390 395 400

gct gga gga aag cta aag aag ttc ccc gag agg cta ttc gca gtg cct 1425
Ala Gly Gly Lys Leu Lys Lys Phe Pro Glu Arg Leu Phe Ala Val Pro
405 410 415

cca agt atc tct aaa ggt ttg att aat ggc gtc gac gag gaa tca tac 1473
Pro Ser Ile Ser Lys Gly Leu Ile Asn Gly Val Asp Glu Glu Ser Tyr
420 425 430

caa gaa gac atc aat cta tgg aag aag cga gtg acc gga tac aag aga 1521
Gln Glu Asp Ile Asn Leu Trp Lys Lys Arg Val Thr Gly Tyr Lys Arg
435 440 445 450

att aac aga ctg ata ggt tcc acc aga tac cgt aat gtg atg gat atg 1569
Ile Asn Arg Leu Ile Gly Ser Thr Arg Tyr Arg Asn Val Met Asp Met
455 460 465

aac gcc ggt ctt ggt gga ttc gct gct gcg ctt gaa tcg cct aaa tcg 1617
Asn Ala Gly Leu Gly Gly Phe Ala Ala Ala Leu Glu Ser Pro Lys Ser
470 475 480

tgg gtt atg aat gtg att cca acc att aac aag aac aca ttg agt gtt 1665
Trp Val Met Asn Val Ile Pro Thr Ile Asn Lys Asn Thr Leu Ser Val
485 490 495

gtt tat gag aga ggt ctc att ggt atc tat cat gac tgg tgt gaa ggc 1713
Val Tyr Glu Arg Gly Leu Ile Gly Ile Tyr His Asp Trp Cys Glu Gly
500 505 510

ttt tca act tat cca aga aca tac gat ttc att cac gct agt ggt gtc 1761
Phe Ser Thr Tyr Pro Arg Thr Tyr Asp Phe Ile His Ala Ser Gly Val
515 520 525 530

ttc agc ttg tat cag cac agc tgc aaa ctt gag gat att ctt ctt gaa 1809
Phe Ser Leu Tyr Gln His Ser Cys Lys Leu Glu Asp Ile Leu Leu Glu
535 540 545

act gat cgg att tta cga ccg gaa ggg att gtg att ttc cgg gat gag 1857
Thr Asp Arg Ile Leu Arg Pro Glu Gly Ile Val Ile Phe Arg Asp Glu
550 555 560

gtt gat gtt ttg aat gat gtg agg aag atc gtt gat gga atg aga tgg 1905
Val Asp Val Leu Asn Asp Val Arg Lys Ile Val Asp Gly Met Arg Trp
565 570 575

gat act aag tta atg gat cat gaa gac ggt cct ctc gtg ccg gag aag 1953
Asp Thr Lys Leu Met Asp His Glu Asp Gly Pro Leu Val Pro Glu Lys
580 585 590

att ctt gtc gcc acg aag cag tat tgg gta gcc ggc gac gat gga aac 2001
Ile Leu Val Ala Thr Lys Gln Tyr Trp Val Ala Gly Asp Asp Gly Asn
595 600 605 610

aat tct ccg tcg tct tct aat agt gaa gaa gaa taa aacaaaaaca 2047
Asn Ser Pro Ser Ser Ser Asn Ser Glu Glu Glu
615 620

aaaaactcct caggttacta agcttgaagt gtagatctat tttacaacat ctggaaaatt 2107

cttatcaaaa aaggaaggaa tcagaatttc cattaaagaa aggtgtcaaa aaaaagttgt 2167

aaaactatat agtagtgatc aagacgaata tgtgcattta tgttttattt ttgttcccta 2227

gtttttaatt ttattttttt gaaggaagaa aaaattagtt ccatgtgttt ttgcaagata 2287

gttgaaacct tggacgcttg ttatgtatga tgcgatcttg acatttttta ataacagtta 2347

ttttaaataa atttatgata taaa 2371



<210> 6 
<211> 621 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 6 

Met Gly Ser Lys His Asn Pro Pro Gly Asn Asn Arg Ser Arg Ser Thr
1 5 10 15

Leu Ser Leu Leu Val Val Val Gly Leu Cys Cys Phe Phe Tyr Leu Leu
20 25 30

Gly Ala Trp Gln Lys Ser Gly Phe Gly Lys Gly Asp Ser Ile Ala Met
35 40 45

Glu Ile Thr Lys Gln Ala Gln Cys Thr Asp Ile Val Thr Asp Leu Asp
50 55 60

Phe Glu Pro His His Asn Thr Val Lys Ile Pro His Lys Ala Asp Pro
65 70 75 80

Lys Pro Val Ser Phe Lys Pro Cys Asp Val Lys Leu Lys Asp Tyr Thr
85 90 95

Pro Cys Gln Glu Gln Asp Arg Ala Met Lys Phe Pro Arg Glu Asn Met
100 105 110

Ile Tyr Arg Glu Arg His Cys Pro Pro Asp Asn Glu Lys Leu Arg Cys
115 120 125

Leu Val Pro Ala Pro Lys Gly Tyr Met Thr Pro Phe Pro Trp Pro Lys
130 135 140

Ser Arg Asp Tyr Val His Tyr Ala Asn Ala Pro Phe Lys Ser Leu Thr
145 150 155 160

Val Glu Lys Ala Gly Gln Asn Trp Val Gln Phe Gln Gly Asn Val Phe
165 170 175

Lys Phe Pro Gly Gly Gly Thr Met Phe Pro Gln Gly Ala Asp Ala Tyr
180 185 190

Ile Glu Glu Leu Ala Ser Val Ile Pro Ile Lys Asp Gly Ser Val Arg
195 200 205

Thr Ala Leu Asp Thr Gly Cys Gly Val Ala Ser Trp Gly Ala Tyr Met
210 215 220

Leu Lys Arg Asn Val Leu Thr Met Ser Phe Ala Pro Arg Asp Asn His
225 230 235 240

Glu Ala Gln Val Gln Phe Ala Leu Glu Arg Gly Val Pro Ala Ile Ile
245 250 255

Ala Val Leu Gly Ser Ile Leu Leu Pro Tyr Pro Ala Arg Ala Phe Asp
260 265 270

Met Ala Gln Cys Ser Arg Cys Leu Ile Pro Trp Thr Ala Asn Glu Gly
275 280 285

Thr Tyr Leu Met Glu Val Asp Arg Val Leu Arg Pro Gly Gly Tyr Trp
290 295 300

Val Leu Ser Gly Pro Pro Ile Asn Trp Lys Thr Trp His Lys Thr Trp
305 310 315 320

Asn Arg Thr Lys Ala Glu Leu Asn Ala Glu Gln Lys Arg Ile Glu Gly
325 330 335

Ile Ala Glu Ser Leu Cys Trp Glu Lys Lys Tyr Glu Lys Gly Asp Ile
340 345 350

Ala Ile Phe Arg Lys Lys Ile Asn Asp Arg Ser Cys Asp Arg Ser Thr
355 360 365

Pro Val Asp Thr Cys Lys Arg Lys Asp Thr Asp Asp Val Trp Tyr Lys
370 375 380

Glu Ile Glu Thr Cys Val Thr Pro Phe Pro Lys Val Ser Asn Glu Glu
385 390 395 400

Glu Val Ala Gly Gly Lys Leu Lys Lys Phe Pro Glu Arg Leu Phe Ala
405 410 415

Val Pro Pro Ser Ile Ser Lys Gly Leu Ile Asn Gly Val Asp Glu Glu
420 425 430

Ser Tyr Gln Glu Asp Ile Asn Leu Trp Lys Lys Arg Val Thr Gly Tyr
435 440 445

Lys Arg Ile Asn Arg Leu Ile Gly Ser Thr Arg Tyr Arg Asn Val Met
450 455 460

Asp Met Asn Ala Gly Leu Gly Gly Phe Ala Ala Ala Leu Glu Ser Pro
465 470 475 480

Lys Ser Trp Val Met Asn Val Ile Pro Thr Ile Asn Lys Asn Thr Leu
485 490 495

Ser Val Val Tyr Glu Arg Gly Leu Ile Gly Ile Tyr His Asp Trp Cys
500 505 510

Glu Gly Phe Ser Thr Tyr Pro Arg Thr Tyr Asp Phe Ile His Ala Ser
515 520 525

Gly Val Phe Ser Leu Tyr Gln His Ser Cys Lys Leu Glu Asp Ile Leu
530 535 540

Leu Glu Thr Asp Arg Ile Leu Arg Pro Glu Gly Ile Val Ile Phe Arg
545 550 555 560

Asp Glu Val Asp Val Leu Asn Asp Val Arg Lys Ile Val Asp Gly Met
565 570 575

Arg Trp Asp Thr Lys Leu Met Asp His Glu Asp Gly Pro Leu Val Pro
580 585 590

Glu Lys Ile Leu Val Ala Thr Lys Gln Tyr Trp Val Ala Gly Asp Asp
595 600 605

Gly Asn Asn Ser Pro Ser Ser Ser Asn Ser Glu Glu Glu
610 615 620



<210> 7
<211> 1764
<212> DNA

<213> Arabidopsis
<220>

<221> CDS

<222> (1)..(1764)

<223> G307

<400> 7



atg aag aga gat cat cac caa ttc caa ggt cga ttg tcc aac cac ggg 48
Met Lys Arg Asp His His Gln Phe Gln Gly Arg Leu Ser Asn His Gly
1 5 10 15

act tct tct tct tca tca tca atc tct aaa gat aag atg atg atg gtg 96
Thr Ser Ser Ser Ser Ser Ser Ile Ser Lys Asp Lys Met Met Met Val
20 25 30

aaa aaa gaa gaa gac ggt gga ggt aac atg gac gac gag ctt ctc gct 144
Lys Lys Glu Glu Asp Gly Gly Gly Asn Met Asp Asp Glu Leu Leu Ala
35 40 45

gtt tta ggt tac aaa gtt agg tca tcg gag atg gcg gag gtt gct ttg 192
Val Leu Gly Tyr Lys Val Arg Ser Ser Glu Met Ala Glu Val Ala Leu
50 55 60

aaa ctc gaa caa tta gag acg atg atg agt aat gtt caa gaa gat ggt 240
Lys Leu Glu Gln Leu Glu Thr Met Met Ser Asn Val Gln Glu Asp Gly
65 70 75 80

tta tct cat ctc gcg acg gat act gtt cat tat aat ccg tcg gag ctt 288
Leu Ser His Leu Ala Thr Asp Thr Val His Tyr Asn Pro Ser Glu Leu
85 90 95

tat tct tgg ctt gat aat atg ctc tct gag ctt aat cct cct cct ctt 336
Tyr Ser Trp Leu Asp Asn Met Leu Ser Glu Leu Asn Pro Pro Pro Leu
100 105 110

ccg gcg agt tct aac ggt tta gat ccg gtt ctt cct tcg ccg gag att 384
Pro Ala Ser Ser Asn Gly Leu Asp Pro Val Leu Pro Ser Pro Glu Ile
115 120 125

tgt ggt ttt ccg gct tcg gat tat gac ctt aaa gtc att ccc gga aac 432
Cys Gly Phe Pro Ala Ser Asp Tyr Asp Leu Lys Val Ile Pro Gly Asn
130 135 140

gcg att tat cag ttt ccg gcg att gat tct tcg tct tcg tcg aat aat 480
Ala Ile Tyr Gln Phe Pro Ala Ile Asp Ser Ser Ser Ser Ser Asn Asn
145 150 155 160

cag aac aag cgt ttg aaa tca tgc tcg agt cct gat tct atg gtt aca 528
Gln Asn Lys Arg Leu Lys Ser Cys Ser Ser Pro Asp Ser Met Val Thr
165 170 175

tcg act tcg acg ggt acg cag att ggt gga gtc ata gga acg acg gtg 576
Ser Thr Ser Thr Gly Thr Gln Ile Gly Gly Val Ile Gly Thr Thr Val
180 185 190

acg aca acc acc acg aca acg acg gcg gcg gct gag tca act cgt tct 624
Thr Thr Thr Thr Thr Thr Thr Thr Ala Ala Ala Glu Ser Thr Arg Ser
195 200 205

gtt atc ctg gtt gac tcg caa gag aac ggt gtt cgt tta gtc cac gcg 672
Val Ile Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala
210 215 220

ctt atg gct tgt gca gaa gca atc cag cag aac aat ttg act cta gcg 720
Leu Met Ala Cys Ala Glu Ala Ile Gln Gln Asn Asn Leu Thr Leu Ala
225 230 235 240

gaa gct ctt gtg aag caa atc gga tgc tta gct gtg tct caa gcc gga 768
Glu Ala Leu Val Lys Gln Ile Gly Cys Leu Ala Val Ser Gln Ala Gly
245 250 255

gct atg aga aaa gtg gct act tac ttc gcc gaa gct tta gct cgg cgg 816
Ala Met Arg Lys Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg
260 265 270

atc tac cgt ctc tct ccg ccg cag aat cag atc gat cat tgt ctc tcc 864
Ile Tyr Arg Leu Ser Pro Pro Gln Asn Gln Ile Asp His Cys Leu Ser
275 280 285

gat act ctt cag atg cac ttt tac gag act tgt cct tat ctt aaa ttc 912
Asp Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe
290 295 300

gct cac ttc acg gcg aac caa gcg att ctc gaa gct ttt gaa ggt aag 960
Ala His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Glu Gly Lys
305 310 315 320

aag aga gta cac gtc att gat ttc tcg atg aac caa ggt ctt caa tgg 1008
Lys Arg Val His Val Ile Asp Phe Ser Met Asn Gln Gly Leu Gln Trp
325 330 335

cct gcg ctt atg caa gct ctt gcg ctt cga gaa gga ggt cct cca act 1056
Pro Ala Leu Met Gln Ala Leu Ala Leu Arg Glu Gly Gly Pro Pro Thr
340 345 350

ttc cgg tta acc gga att ggt cca ccg gcg ccg gat aat tct gat cat 1104
Phe Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Ser Asp His
355 360 365

ctt cat gaa gtt ggt tgt aaa tta gct cag ctt gcg gag gcg att cac 1152
Leu His Glu Val Gly Cys Lys Leu Ala Gln Leu Ala Glu Ala Ile His
370 375 380

gta gaa ttc gaa tac cgt gga ttc gtt gct aac agc tta gcc gat ctc 1200
Val Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Ser Leu Ala Asp Leu
385 390 395 400

gat gct tcg atg ctt gag ctt aga ccg agc gat acg gaa gct gtt gcg 1248
Asp Ala Ser Met Leu Glu Leu Arg Pro Ser Asp Thr Glu Ala Val Ala
405 410 415

gtg aac tct gtt ttt gag cta cat aag ctc tta ggt cgt ccc ggt ggg 1296
Val Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Pro Gly Gly
420 425 430

ata gag aaa gtt ctc ggc gtt gtg aaa cag att aaa ccg gtg att ttc 1344
Ile Glu Lys Val Leu Gly Val Val Lys Gln Ile Lys Pro Val Ile Phe
435 440 445

acg gtg gtt gag caa gaa tcg aac cat aac gga ccg gtt ttc tta gac 1392
Thr Val Val Glu Gln Glu Ser Asn His Asn Gly Pro Val Phe Leu Asp
450 455 460

cgg ttt act gaa tcg tta cat tat tat tcg act ctg ttt gat tcg ttg 1440
Arg Phe Thr Glu Ser Leu His Tyr Tyr Ser Thr Leu Phe Asp Ser Leu
465 470 475 480

gaa gga gtt ccg aat agt caa gac aaa gtc atg tct gaa gtt tac tta 1488
Glu Gly Val Pro Asn Ser Gln Asp Lys Val Met Ser Glu Val Tyr Leu
485 490 495

ggg aaa cag att tgt aat ctg gtg gct tgt gaa ggt cct gac aga gtc 1536
Gly Lys Gln Ile Cys Asn Leu Val Ala Cys Glu Gly Pro Asp Arg Val
500 505 510

gag aga cac gaa acg ttg agt caa tgg gga aac cgg ttt ggt tcg tcc 1584
Glu Arg His Glu Thr Leu Ser Gln Trp Gly Asn Arg Phe Gly Ser Ser
515 520 525

ggt tta gcg ccg gca cat ctt ggg tct aac gcg ttt aag caa gcg agt 1632
Gly Leu Ala Pro Ala His Leu Gly Ser Asn Ala Phe Lys Gln Ala Ser
530 535 540

atg ctt ttg tct gtg ttt aat agt ggc caa ggt tat cgt gtg gag gag 1680
Met Leu Leu Ser Val Phe Asn Ser Gly Gln Gly Tyr Arg Val Glu Glu
545 550 555 560

agt aat gga tgt ttg atg ttg ggt tgg cac act cgc cca ctc att acc 1728
Ser Asn Gly Cys Leu Met Leu Gly Trp His Thr Arg Pro Leu Ile Thr
565 570 575

acc tcc gct tgg aaa ctc tcg acg gcg gcg cac tga 1764
Thr Ser Ala Trp Lys Leu Ser Thr Ala Ala His
580 585

<210> 8 
<211> 587 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 8 

Met Lys Arg Asp His His Gln Phe Gln Gly Arg Leu Ser Asn His Gly
1 5 10 15

Thr Ser Ser Ser Ser Ser Ser Ile Ser Lys Asp Lys Met Met Met Val
20 25 30

Lys Lys Glu Glu Asp Gly Gly Gly Asn Met Asp Asp Glu Leu Leu Ala
35 40 45

Val Leu Gly Tyr Lys Val Arg Ser Ser Glu Met Ala Glu Val Ala Leu
50 55 60

Lys Leu Glu Gln Leu Glu Thr Met Met Ser Asn Val Gln Glu Asp Gly
65 70 75 80

Leu Ser His Leu Ala Thr Asp Thr Val His Tyr Asn Pro Ser Glu Leu
85 90 95

Tyr Ser Trp Leu Asp Asn Met Leu Ser Glu Leu Asn Pro Pro Pro Leu
100 105 110

Pro Ala Ser Ser Asn Gly Leu Asp Pro Val Leu Pro Ser Pro Glu Ile
115 120 125

Cys Gly Phe Pro Ala Ser Asp Tyr Asp Leu Lys Val Ile Pro Gly Asn
130 135 140

Ala Ile Tyr Gln Phe Pro Ala Ile Asp Ser Ser Ser Ser Ser Asn Asn
145 150 155 160

Gln Asn Lys Arg Leu Lys Ser Cys Ser Ser Pro Asp Ser Met Val Thr
165 170 175

Ser Thr Ser Thr Gly Thr Gln Ile Gly Gly Val Ile Gly Thr Thr Val
180 185 190

Thr Thr Thr Thr Thr Thr Thr Thr Ala Ala Ala Glu Ser Thr Arg Ser
195 200 205

Val Ile Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala
210 215 220

Leu Met Ala Cys Ala Glu Ala Ile Gln Gln Asn Asn Leu Thr Leu Ala
225 230 235 240

Glu Ala Leu Val Lys Gln Ile Gly Cys Leu Ala Val Ser Gln Ala Gly
245 250 255

Ala Met Arg Lys Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg
260 265 270

Ile Tyr Arg Leu Ser Pro Pro Gln Asn Gln Ile Asp His Cys Leu Ser
275 280 285

Asp Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe
290 295 300

Ala His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Glu Gly Lys
305 310 315 320

Lys Arg Val His Val Ile Asp Phe Ser Met Asn Gln Gly Leu Gln Trp
325 330 335

Pro Ala Leu Met Gln Ala Leu Ala Leu Arg Glu Gly Gly Pro Pro Thr
340 345 350

Phe Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Ser Asp His
355 360 365

Leu His Glu Val Gly Cys Lys Leu Ala Gln Leu Ala Glu Ala Ile His
370 375 380

Val Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Ser Leu Ala Asp Leu
385 390 395 400

Asp Ala Ser Met Leu Glu Leu Arg Pro Ser Asp Thr Glu Ala Val Ala
405 410 415

Val Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Pro Gly Gly
420 425 430

Ile Glu Lys Val Leu Gly Val Val Lys Gln Ile Lys Pro Val Ile Phe
435 440 445

Thr Val Val Glu Gln Glu Ser Asn His Asn Gly Pro Val Phe Leu Asp
450 455 460

Arg Phe Thr Glu Ser Leu His Tyr Tyr Ser Thr Leu Phe Asp Ser Leu
465 470 475 480

Glu Gly Val Pro Asn Ser Gln Asp Lys Val Met Ser Glu Val Tyr Leu
485 490 495

Gly Lys Gln Ile Cys Asn Leu Val Ala Cys Glu Gly Pro Asp Arg Val
500 505 510

Glu Arg His Glu Thr Leu Ser Gln Trp Gly Asn Arg Phe Gly Ser Ser
515 520 525

Gly Leu Ala Pro Ala His Leu Gly Ser Asn Ala Phe Lys Gln Ala Ser
530 535 540

Met Leu Leu Ser Val Phe Asn Ser Gly Gln Gly Tyr Arg Val Glu Glu
545 550 555 560

Ser Asn Gly Cys Leu Met Leu Gly Trp His Thr Arg Pro Leu Ile Thr
565 570 575

Thr Ser Ala Trp Lys Leu Ser Thr Ala Ala His
580 585

<210> 9 
<211> 825
<212> DNA

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (1)..(825) 

<223> G346 

<400> 9 

atg gaa atg gaa tca ttc atg gac gac ctt ttg aac ttc tct gta ccg 48
Met Glu Met Glu Ser Phe Met Asp Asp Leu Leu Asn Phe Ser Val Pro
1 5 10 15

gaa gag gaa gaa gac gac gac gaa cat acg caa cca ccg agg aat att 96
Glu Glu Glu Glu Asp Asp Asp Glu His Thr Gln Pro Pro Arg Asn Ile
20 25 30

act cgc cgg aaa act gga tta cgg cca aca gac tcc ttc ggt ctc ttt 144
Thr Arg Arg Lys Thr Gly Leu Arg Pro Thr Asp Ser Phe Gly Leu Phe
35 40 45

aat acc gac gac ctt gga gtg gtt gaa gaa gag gat ttg gaa tgg att 192
Asn Thr Asp Asp Leu Gly Val Val Glu Glu Glu Asp Leu Glu Trp Ile
50 55 60

tca aac aaa aat gct ttt ccg gtg att gaa aca ttc gtc ggt gta tta 240
Ser Asn Lys Asn Ala Phe Pro Val Ile Glu Thr Phe Val Gly Val Leu
65 70 75 80

ccg tcg gag cat ttt cct ata acg tct ctt ctg gaa aga gaa gcg act 288
Pro Ser Glu His Phe Pro Ile Thr Ser Leu Leu Glu Arg Glu Ala Thr
85 90 95

gag gta aaa cag ctg agt ccg gtt tca gta ctt gag acg agt agc cat 336
Glu Val Lys Gln Leu Ser Pro Val Ser Val Leu Glu Thr Ser Ser His
100 105 110

agc tcc aca acg act acc tca aac agt agc ggc gga agt aac gga agc 384
Ser Ser Thr Thr Thr Thr Ser Asn Ser Ser Gly Gly Ser Asn Gly Ser
115 120 125

acg gcc gtg gct acg acc acc acc act cca aca ata atg agc tgt tgc 432
Thr Ala Val Ala Thr Thr Thr Thr Thr Pro Thr Ile Met Ser Cys Cys
130 135 140

gtt ggt ttt aaa gcg ccg gct aaa gcg aga agc aag cgt cgt cgt aca 480
Val Gly Phe Lys Ala Pro Ala Lys Ala Arg Ser Lys Arg Arg Arg Thr
145 150 155 160

gga cgc cgt gat tta cga gtt ttg tgg aca gga aac gag caa gga gga 528
Gly Arg Arg Asp Leu Arg Val Leu Trp Thr Gly Asn Glu Gln Gly Gly
165 170 175

ata cag aag aag aag acg atg act gtg gcg gcg gct gcg ttg att atg 576
Ile Gln Lys Lys Lys Thr Met Thr Val Ala Ala Ala Ala Leu Ile Met
180 185 190

gga agg aag tgt caa cac tgt gga gcg gag aag act ccg caa tgg agg 624
Gly Arg Lys Cys Gln His Cys Gly Ala Glu Lys Thr Pro Gln Trp Arg
195 200 205

gca gga cca gcg ggg cct aag act ctg tgt aac gct tgt ggc gtg agg 672
Ala Gly Pro Ala Gly Pro Lys Thr Leu Cys Asn Ala Cys Gly Val Arg
210 215 220

tat aag tcc ggg agg cta gtt ccg gag tat cgt cca gcg aac agt cca 720
Tyr Lys Ser Gly Arg Leu Val Pro Glu Tyr Arg Pro Ala Asn Ser Pro
225 230 235 240

act ttc acg gcg gag tta cat tcg aat tct cac cgg aag att gta gag 768
Thr Phe Thr Ala Glu Leu His Ser Asn Ser His Arg Lys Ile Val Glu
245 250 255

atg agg aag cag tat cag tcc ggt gac ggt gac ggt gat cgg aaa gat 816
Met Arg Lys Gln Tyr Gln Ser Gly Asp Gly Asp Gly Asp Arg Lys Asp
260 265 270

tgt gga taa 825
Cys Gly



<210> 10 
<211> 274 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 10 

Met Glu Met Glu Ser Phe Met Asp Asp Leu Leu Asn Phe Ser Val Pro
1 5 10 15

Glu Glu Glu Glu Asp Asp Asp Glu His Thr Gln Pro Pro Arg Asn Ile
20 25 30

Thr Arg Arg Lys Thr Gly Leu Arg Pro Thr Asp Ser Phe Gly Leu Phe
35 40 45

Asn Thr Asp Asp Leu Gly Val Val Glu Glu Glu Asp Leu Glu Trp Ile
50 55 60

Ser Asn Lys Asn Ala Phe Pro Val Ile Glu Thr Phe Val Gly Val Leu
65 70 75 80

Pro Ser Glu His Phe Pro Ile Thr Ser Leu Leu Glu Arg Glu Ala Thr
85 90 95

Glu Val Lys Gln Leu Ser Pro Val Ser Val Leu Glu Thr Ser Ser His
100 105 110

Ser Ser Thr Thr Thr Thr Ser Asn Ser Ser Gly Gly Ser Asn Gly Ser
115 120 125

Thr Ala Val Ala Thr Thr Thr Thr Thr Pro Thr Ile Met Ser Cys Cys
130 135 140

Val Gly Phe Lys Ala Pro Ala Lys Ala Arg Ser Lys Arg Arg Arg Thr
145 150 155 160

Gly Arg Arg Asp Leu Arg Val Leu Trp Thr Gly Asn Glu Gln Gly Gly
165 170 175

Ile Gln Lys Lys Lys Thr Met Thr Val Ala Ala Ala Ala Leu Ile Met
180 185 190

Gly Arg Lys Cys Gln His Cys Gly Ala Glu Lys Thr Pro Gln Trp Arg
195 200 205

Ala Gly Pro Ala Gly Pro Lys Thr Leu Cys Asn Ala Cys Gly Val Arg
210 215 220

Tyr Lys Ser Gly Arg Leu Val Pro Glu Tyr Arg Pro Ala Asn Ser Pro
225 230 235 240

Thr Phe Thr Ala Glu Leu His Ser Asn Ser His Arg Lys Ile Val Glu
245 250 255

Met Arg Lys Gln Tyr Gln Ser Gly Asp Gly Asp Gly Asp Arg Lys Asp
260 265 270

Cys Gly



<210> 11
<211> 1226
<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (248)..(1039)

<223> G598

<400> 11



gtccgttgtc atattttaaa tttatcacct tcttgagaat tccacatttt tatccttttt 60

gtcatgtagt gtatattttt tcctctaacc taattaaaat caaaacaaaa tcctttgacc 120

caattagctt cgcgatatat cagaagagat caaactactt tgatcagacc atgatcttct 180

tcttcttctt cttcttcttc ttcttctttt tagacgatca caattcctaa accctatttc 240

tcagatt atg ctg act ctt tac cat caa gaa agg tca ccg gac gcc aca 289
Met Leu Thr Leu Tyr His Gln Glu Arg Ser Pro Asp Ala Thr
1 5 10

agt aat gat cgc gat gag acg cca gag act gtg gtt aga gaa gtc cac 337
Ser Asn Asp Arg Asp Glu Thr Pro Glu Thr Val Val Arg Glu Val His
15 20 25 30

gcg cta act cca gcg ccg gag gat aat tcc cgg acg atg acg gcg acg 385
Ala Leu Thr Pro Ala Pro Glu Asp Asn Ser Arg Thr Met Thr Ala Thr
35 40 45

cta cct cca ccg cct gct ttc cga ggc tat ttt tct cct cca agg tca 433
Leu Pro Pro Pro Pro Ala Phe Arg Gly Tyr Phe Ser Pro Pro Arg Ser
50 55 60

gcg acg acg atg agc gaa gga gag aac ttc aca act ata agc aga gag 481
Ala Thr Thr Met Ser Glu Gly Glu Asn Phe Thr Thr Ile Ser Arg Glu
65 70 75

ttc aac gct cta gtc atc gcc gga tcc tcc atg gag aac aac gaa cta 529
Phe Asn Ala Leu Val Ile Ala Gly Ser Ser Met Glu Asn Asn Glu Leu
80 85 90

atg act cgt gac gtc acg cag cgt gaa gat gag aga caa gac gag ttg 577
Met Thr Arg Asp Val Thr Gln Arg Glu Asp Glu Arg Gln Asp Glu Leu
95 100 105 110

atg aga atc cac gag gac acg gat cat gaa gag gaa acg aat cct tta 625
Met Arg Ile His Glu Asp Thr Asp His Glu Glu Glu Thr Asn Pro Leu
115 120 125

gca atc gtg ccg gat cag tat cct ggt tcg ggt ttg gat cct gga agt 673
Ala Ile Val Pro Asp Gln Tyr Pro Gly Ser Gly Leu Asp Pro Gly Ser
130 135 140

gat aat ggg ccg ggt cag agt cgg gtt ggg tcg acg gtg caa aga gtt 721
Asp Asn Gly Pro Gly Gln Ser Arg Val Gly Ser Thr Val Gln Arg Val
145 150 155

aag agg gaa gag gtg gaa gcg aag ata acg gcg tgg cag acg gca aaa 769
Lys Arg Glu Glu Val Glu Ala Lys Ile Thr Ala Trp Gln Thr Ala Lys
160 165 170

ctg gct aag att aat aac agg ttt aag agg gaa gac gcc gtt act aac 817
Leu Ala Lys Ile Asn Asn Arg Phe Lys Arg Glu Asp Ala Val Ile Asn
175 180 185 190

ggt tgg ttt aat gaa caa gtt aac aag gcc aac tct tgg atg aag aaa 865
Gly Trp Phe Asn Glu Gln Val Asn Lys Ala Asn Ser Trp Met Lys Lys
195 200 205

att gag tat aat gta ggt tca ttc aac aat cgt cta aat gag gaa gct 913
Ile Glu Tyr Asn Val Gly Ser Phe Asn Asn Arg Leu Asn Glu Glu Ala
210 215 220

aga gga gag aaa agc aaa agc gat gga gaa aac gca aaa caa tgt ggc 961
Arg Gly Glu Lys Ser Lys Ser Asp Gly Glu Asn Ala Lys Gln Cys Gly
225 230 235

gaa agc gca gag gaa agc gga gga gag aag agc gac ggc aga ggc aaa 1009
Glu Ser Ala Glu Glu Ser Gly Gly Glu Lys Ser Asp Gly Arg Gly Lys
240 245 250

gag agg gac aga ggt tgc aaa agt agt tga agttgctaat ctcatgagag 1059
Glu Arg Asp Arg Gly Cys Lys Ser Ser
255 260

cccttggacg tcctcctgcc aaacgctcct tcttctcttt ctcctaattt ttagttatat 1119

caaaccatta aattaaacag tactcgttat atatctagtt agtaaacaaa ggggcagttt 1179

tatagctcat gtacacataa ttgagagtgt agtactgttg tgtcaaa 1226

<210> 12 
<211> 263 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 12 

Met Leu Thr Leu Tyr His Gln Glu Arg Ser Pro Asp Ala Thr Ser Asn
1 5 10 15

Asp Arg Asp Glu Thr Pro Glu Thr Val Val Arg Glu Val His Ala Leu
20 25 30

Thr Pro Ala Pro Glu Asp Asn Ser Arg Thr Met Thr Ala Thr Leu Pro
35 40 45

Pro Pro Pro Ala Phe Arg Gly Tyr Phe Ser Pro Pro Arg Ser Ala Thr
50 55 60

Thr Met Ser Glu Gly Glu Asn Phe Thr Thr Ile Ser Arg Glu Phe Asn
65 70 75 80

Ala Leu Val Ile Ala Gly Ser Ser Met Glu Asn Asn Glu Leu Met Thr
85 90 95

Arg Asp Val Thr Gln Arg Glu Asp Glu Arg Gln Asp Glu Leu Met Arg
100 105 110

Ile His Glu Asp Thr Asp His Glu Glu Glu Thr Asn Pro Leu Ala Ile
115 120 125

Val Pro Asp Gln Tyr Pro Gly Ser Gly Leu Asp Pro Gly Ser Asp Asn
130 135 140

Gly Pro Gly Gln Ser Arg Val Gly Ser Thr Val Gln Arg Val Lys Arg
145 150 155 160

Glu Glu Val Glu Ala Lys Ile Thr Ala Trp Gln Thr Ala Lys Leu Ala
165 170 175

Lys Ile Asn Asn Arg Phe Lys Arg Glu Asp Ala Val Ile Asn Gly Trp
180 185 190

Phe Asn Glu Gln Val Asn Lys Ala Asn Ser Trp Met Lys Lys Ile Glu
195 200 205

Tyr Asn Val Gly Ser Phe Asn Asn Arg Leu Asn Glu Glu Ala Arg Gly
210 215 220

Glu Lys Ser Lys Ser Asp Gly Glu Asn Ala Lys Gln Cys Gly Glu Ser
225 230 235 240

Ala Glu Glu Ser Gly Gly Glu Lys Ser Asp Gly Arg Gly Lys Glu Arg
245 250 255

Asp Arg Gly Cys Lys Ser Ser
260


<210> 13
<211> 1263
<212> DNA

<213> Arabidopsis thaliana
<220>

<221> CDS

<222> (72)..(1076)

<223> G605

<400> 13



aattccatcc taataatttt caaagcttta attctaagaa ataatatcta caagaaaat* 60

ttatctcatg t atg gag act acc gga gaa gtt gtt aaa aca acc acc ggg 110
Met Glu Thr Thr Gly Glu Val Val Lys Thr Thr Thr Gly
1 5 10

agc gac gga ggc gtt acg gtg gtg aga tcc aac gcg ccg tca gac ttc 158
Ser Asp Gly Gly Val Thr Val Val Arg Ser Asn Ala Pro Ser Asp Phe
15 20 25

cac atg gct ccg agg tca gaa act tca aac aca cct ccc aac tcc gtc 206
His Met Ala Pro Arg Ser Glu Thr Ser Asn Thr Pro Pro Asn Ser Val
30 35 40 45

gct cct cct cct cct cca ccg ccg caa aac tcc ttt act ccg tcg gcg 254
Ala Pro Pro Pro Pro Pro Pro Pro Gln Asn Ser Phe Thr Pro Ser Ala
50 55 60

gct atg gat ggt ttc tca agc gga ccg ata aag aag aga cgt ggg cgc 302
Ala Met Asp Gly Phe Ser Ser Gly Pro Ile Lys Lys Arg Arg Gly Arg
65 70 75

cct agg aag tac gga cac gac gga gca gcg gtg acg cta tct ccg aat 350
Pro Arg Lys Tyr Gly His Asp Gly Ala Ala Val Thr Leu Ser Pro Asn
80 85 90

ccg ata tca tca gcc gca cca acg act tct cac gtc atc gat ttc tcg 398
Pro Ile Ser Ser Ala Ala Pro Thr Thr Ser His Val Ile Asp Phe Ser
95 100 105

acg aca tcg gag aaa cgt ggc aaa atg aaa cca gca act cca act cca 446
Thr Thr Ser Glu Lys Arg Gly Lys Met Lys Pro Ala Thr Pro Thr Pro
110 115 120 125

agc tca ttc atc agg cca aag tac cag gtc gag aat tta ggt gaa tgg 494
Ser Ser Phe Ile Arg Pro Lys Tyr Gln Val Glu Asn Leu Gly Glu Trp
130 135 140

tct cct tcc tct gcc gcc gct aat ttc acg ccg cat att att acg gtg 542
Ser Pro Ser Ser Ala Ala Ala Asn Phe Thr Pro His Ile Ile Thr Val
145 150 155

aat gca ggc gag gac gtt acg aag agg ata ata tca ttt tct caa caa 590
Asn Ala Gly Glu Asp Val Thr Lys Arg Ile Ile Ser Phe Ser Gln Gln
160 165 170

ggg tct cta gct att tgc gtt tta tgc gca aac ggt gtc gtt tcg agc 638
Gly Ser Leu Ala Ile Cys Val Leu Cys Ala Asn Gly Val Val Ser Ser
175 180 185

gtt aca ctt cgt cag cct gat tca tct ggt ggt aca ttg acc tat gag 686
Val Thr Leu Arg Gln Pro Asp Ser Ser Gly Gly Thr Leu Thr Tyr Glu
190 195 200 205

ggt cgg ttt gag ata ttg tca cta tct gga aca ttc atg cct agt gac 734
Gly Arg Phe Glu Ile Leu Ser Leu Ser Gly Thr Phe Met Pro Ser Asp
210 215 220

tca gac ggg aca cga agc aga aca ggc ggg atg agc gtg tcg ctt gct 782
Ser Asp Gly Thr Arg Ser Arg Thr Gly Gly Met Ser Val Ser Leu Ala
225 230 235

agc cct gat gga cgt gta gta ggt ggt ggt gtt gct ggc ttg ctg gtt 830
Ser Pro Asp Gly Arg Val Val Gly Gly Gly Val Ala Gly Leu Leu Val
240 245 250

gca gcc act cct att caa gtg gtt gta gga act ttc tta ggt gga aca 878
Ala Ala Thr Pro Ile Gln Val Val Val Gly Thr Phe Leu Gly Gly Thr
255 260 265

aac cag caa gaa cag aca ccg aag ccg cat aac cac aac ttc atg tct 926
Asn Gln Gln Glu Gln Thr Pro Lys Pro His Asn His Asn Phe Met Ser
270 275 280 285

tct cca tta atg cca act tct tcg aat gta gct gat cat cga acc atc 974
Ser Pro Leu Met Pro Thr Ser Ser Asn Val Ala Asp His Arg Thr Ile
290 295 300

cgt ccc atg aca tct agt ctc ccg atc agt aca tgg aca ccg tct ttt 1022
Arg Pro Met Thr Ser Ser Leu Pro Ile Ser Thr Trp Thr Pro Ser Phe
305 310 315

cct tct gat tca cga cac aag cat tct cat gac ttt aat atc act ttg 1070
Pro Ser Asp Ser Arg His Lys His Ser His Asp Phe Asn Ile Thr Leu
320 325 330

acg tga tttcttcctt gaagaactcg tagatcctct gtattttggt ttccagttta 1126
Thr

gggctctaca tgttagactc tcaaagtcta ggtgttatgt tggtctgtca cttaggattg 1186

tcacttagga ttgttagacc atctccatca atggtttctc attgagaaac tgttcaatat 1246

aaaaataaaa tataatc 1263

<210> 14 

<211> 334 

<212> PRT 

<213> Arabidopsis thaliana 

<400> 14 

Met Glu Thr Thr Gly Glu Val Val Lys Thr Thr Thr Gly Ser Asp Gly
1 5 10 15

Gly Val Thr Val Val Arg Ser Asn Ala Pro Ser Asp Phe His Met Ala
20 25 30

Pro Arg Ser Glu Thr Ser Asn Thr Pro Pro Asn Ser Val Ala Pro Pro
35 40 45

Pro Pro Pro Pro Pro Gln Asn Ser Phe Thr Pro Ser Ala Ala Met Asp
50 55 60

Gly Phe Ser Ser Gly Pro Ile Lys Lys Arg Arg Gly Arg Pro Arg Lys
65 70 75 80

Tyr Gly His Asp Gly Ala Ala Val Thr Leu Ser Pro Asn Pro Ile Ser
85 90 95

Ser Ala Ala Pro Thr Thr Ser His Val Ile Asp Phe Ser Thr Thr Ser
100 105 110

Glu Lys Arg Gly Lys Met Lys Pro Ala Thr Pro Thr Pro Ser Ser Phe
115 120 125

Ile Arg Pro Lys Tyr Gln Val Glu Asn Leu Gly Glu Trp Ser Pro Ser
130 135 140

Ser Ala Ala Ala Asn Phe Thr Pro His Ile Ile Thr Val Asn Ala Gly
145 150 155 160

Glu Asp Val Thr Lys Arg Ile Ile Ser Phe Ser Gln Gln Gly Ser Leu
165 170 175

Ala Ile Cys Val Leu Cys Ala Asn Gly Val Val Ser Ser Val Thr Leu
180 185 190

Arg Gln Pro Asp Ser Ser Gly Gly Thr Leu Thr Tyr Glu Gly Arg Phe
195 200 205

Glu Ile Leu Ser Leu Ser Gly Thr Phe Met Pro Ser Asp Ser Asp Gly
210 215 220

Thr Arg Ser Arg Thr Gly Gly Met Ser Val Ser Leu Ala Ser Pro Asp
225 230 235 240

Gly Arg Val Val Gly Gly Gly Val Ala Gly Leu Leu Val Ala Ala Thr
245 250 255

Pro Ile Gln Val Val Val Gly Thr Phe Leu Gly Gly Thr Asn Gln Gln
260 265 270

Glu Gln Thr Pro Lys Pro His Asn His Asn Phe Met Ser Ser Pro Leu
275 280 285

Met Pro Thr Ser Ser Asn Val Ala Asp His Arg Thr Ile Arg Pro Met
290 295 300

Thr Ser Ser Leu Pro Ile Ser Thr Trp Thr Pro Ser Phe Pro Ser Asp
305 310 315 320

Ser Arg His Lys His Ser His Asp Phe Asn Ile Thr Leu Thr
325 330



<210> 15
<211> 1057
<212> DNA

<213> Arabidopsis
<220>

<221> CDS

<222> (54)..(914)

<223> G777

<400> 15



gtggctctct ctttatcttt cttggagttt agttagagat tttaacgttg caa atg 56
Met
1

gat caa cca atg aaa cca aaa act tgc tct gaa tct gat ttt gct gat 104
Asp Gln Pro Met Lys Pro Lys Thr Cys Ser Glu Ser Asp Phe Ala Asp
5 10 15

gat tcc tct gct tct tct tct tct tct tcg gga caa aat ctc aga gga 152
Asp Ser Ser Ala Ser Ser Ser Ser Ser Ser Gly Gln Asn Leu Arg Gly
20 25 30

gct gag atg gtg gtg gaa gtg aag aag gaa gca gtt tgt tcc cag aaa 200
Ala Glu Met Val Val Glu Val Lys Lys Glu Ala Val Cys Ser Gln Lys
35 40 45

gca gag cga gag aag ctt cgt aga gat aag ctt aag gaa cag ttt ctt 248
Ala Glu Arg Glu Lys Leu Arg Arg Asp Lys Leu Lys Glu Gln Phe Leu
50 55 60 65

gag ctt gga aat gca ctt gat ccg aat agg cct aag agt gac aaa gcc 296
Glu Leu Gly Asn Ala Leu Asp Pro Asn Arg Pro Lys Ser Asp Lys Ala
70 75 80

tca gtt ctc act gat aca ata caa atg ctc aag gat gta atg aac caa 344
Ser Val Leu Thr Asp Thr Ile Gln Met Leu Lys Asp Val Met Asn Gln
85 90 95

gtt gat aga cta aaa gct gag tat gaa aca cta tct caa gag tct cgt 392
Val Asp Arg Leu Lys Ala Glu Tyr Glu Thr Leu Ser Gln Glu Ser Arg
100 105 110

gag cta att caa gag aag agt gag ctg aga gag gag aaa gcg act tta 440
Glu Leu Ile Gln Glu Lys Ser Glu Leu Arg Glu Glu Lys Ala Thr Leu
115 120 125

aag tct gat atc gag att ctt aat gct caa tat cag cat aga atc aaa 488
Lys Ser Asp Ile Glu Ile Leu Asn Ala Gln Tyr Gln His Arg Ile Lys
130 135 140 145

acc atg gtt cca tgg gta cct cat tac agt tat cat atc ccc ttc gta 536
Thr Met Val Pro Trp Val Pro His Tyr Ser Tyr His Ile Pro Phe Val
150 155 160

gcc ata act cag ggt cag tcc agt ttt ata cct tat tca gcc tct gtc 584
Ala Ile Thr Gln Gly Gln Ser Ser Phe Ile Pro Tyr Ser Ala Ser Val
165 170 175

aat cct cta acc gaa caa caa gca tcg gtt cag cag cat tct tct tct 632
Asn Pro Leu Thr Glu Gln Gln Ala Ser Val Gln Gln His Ser Ser Ser
180 185 190

tct gcc gat gct tca atg aaa caa gat tcc aaa atc aag ccg tta gat 680
Ser Ala Asp Ala Ser Met Lys Gln Asp Ser Lys Ile Lys Pro Leu Asp
195 200 205

ttg gat ctg atg atg aac agt aac cat tca ggt caa gga aat gat caa 728
Leu Asp Leu Met Met Asn Ser Asn His Ser Gly Gln Gly Asn Asp Gln
210 215 220 225

aaa gat gat gtt cgt tta aag ctc gag ctt aaa atc cat gcc tct tct 776
Lys Asp Asp Val Arg Leu Lys Leu Glu Leu Lys Ile His Ala Ser Ser
230 235 240

tta gct caa cag gat gtt tct gga aaa gag aag aaa gta agc ttg aca 824
Leu Ala Gln Gln Asp Val Ser Gly Lys Glu Lys Lys Val Ser Leu Thr
245 250 255

acc act gca agc tca tcg aat agt tac tca tta tct caa gct gtt caa 872
Thr Thr Ala Ser Ser Ser Asn Ser Tyr Ser Leu Ser Gln Ala Val Gln
260 265 270

gat agt tcc ccc ggt acc gta aat gac atg ttg aag cca taa 914
Asp Ser Ser Pro Gly Thr Val Asn Asp Met Leu Lys Pro
275 280 285

accaataaac atattcccct gaacttgtgt ttaataccgt gattgagaag gtaccatgat 974 

taaacttgtt gtagattatc cacatgatta acgatgtatt cttatcacaa gcaaataaaa 1034 

cacaaaagca tttgcttaaa aaa 1057 

<210> 16 
<211> 286 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 16 

Met Asp Gln Pro Met Lys Pro Lys Thr Cys Ser Glu Ser Asp Phe Ala
1 5 10 15

Asp Asp Ser Ser Ala Ser Ser Ser Ser Ser Ser Gly Gln Asn Leu Arg
20 25 30

Gly Ala Glu Met Val Val Glu Val Lys Lys Glu Ala Val Cys Ser Gln
35 40 45

Lys Ala Glu Arg Glu Lys Leu Arg Arg Asp Lys Leu Lys Glu Gln Phe
50 55 60

Leu Glu Leu Gly Asn Ala Leu Asp Pro Asn Arg Pro Lys Ser Asp Lys
65 70 75 80

Ala Ser Val Leu Thr Asp Thr Ile Gln Met Leu Lys Asp Val Met Asn
85 90 95

Gln Val Asp Arg Leu Lys Ala Glu Tyr Glu Thr Leu Ser Gln Glu Ser
100 105 110

Arg Glu Leu Ile Gln Glu Lys Ser Glu Leu Arg Glu Glu Lys Ala Thr
115 120 125

Leu Lys Ser Asp Ile Glu Ile Leu Asn Ala Gln Tyr Gln His Arg Ile
130 135 140

Lys Thr Met Val Pro Trp Val Pro His Tyr Ser Tyr His Ile Pro Phe
145 150 155 160

Val Ala Ile Thr Gln Gly Gln Ser Ser Phe Ile Pro Tyr Ser Ala Ser
165 170 175

Val Asn Pro Leu Thr Glu Gln Gln Ala Ser Val Gln Gln His Ser Ser
180 185 190

Ser Ser Ala Asp Ala Ser Met Lys Gln Asp Ser Lys Ile Lys Pro Leu
195 200 205

Asp Leu Asp Leu Met Met Asn Ser Asn His Ser Gly Gln Gly Asn Asp
210 215 220

Gln Lys Asp Asp Val Arg Leu Lys Leu Glu Leu Lys Ile His Ala Ser
225 230 235 240

Ser Leu Ala Gln Gln Asp Val Ser Gly Lys Glu Lys Lys Val Ser Leu
245 250 255

Thr Thr Thr Ala Ser Ser Ser Asn Ser Tyr Ser Leu Ser Gln Ala Val
260 265 270

Gln Asp Ser Ser Pro Gly Thr Val Asn Asp Met Leu Lys Pro
275 280 285


<210> 17
<211> 1571
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (428)..(1402)
<223> G869
<400> 17



3W3 ^jgt ttcgatctga taatcaacaa gaaaaaaggg 60 

tttgatttat gtcggctggg tttgaatcga ctgtgatttt gtctttgatt catatctctt 120 

ctccgatttc atcatcatct tccccatcat cgtcgtcttt gaaatcttgt cttctcaacg 180 

ctcttcactt ctgctgtaat aagcagaggc ttgttctgga gactccttct ctttccatgc 240 

gcttaagacc caaaaggact tgttctagtg ttgaagtctt tgggggtttt cacataaagc 300 

agcaaaagtt ttcttttttc atagttcgct gagagttttg agttttgata ccaaaaaagt 360 

tttgaccttt tagagtgatt ttttgttctt tctgttttct gggtattttt gaggagtggg 420 

tttaaca atg gtt gcg att aga aag gaa cag tct ttg agt ggt gtt agt 469
Met Val Ala Ile Arg Lys Glu Gln Ser Leu Ser Gly Val Ser
1 5 10

agc gag att aag aag aga gct aag aga aac act cta tcg tcc ctt cct 517
Ser Glu Ile Lys Lys Arg Ala Lys Arg Asn Thr Leu Ser Ser Leu Pro
15 20 25 30

caa gaa acc caa cct ttg agg aaa gtc cgt att att gtg aat gat cct 565
Gln Glu Thr Gln Pro Leu Arg Lys Val Arg Ile Ile Val Asn Asp Pro
35 40 45

tat gct act gat gat tcc tct agt gat gag gaa gag ctt aag gtt cct 613
Tyr Ala Thr Asp Asp Ser Ser Ser Asp Glu Glu Glu Leu Lys Val Pro
50 55 60

aag cca agg aaa atg aaa cgt atc gtt cgt gag att aac ttt cct tct 661
Lys Pro Arg Lys Met Lys Arg Ile Val Arg Glu Ile Asn Phe Pro Ser
65 70 75

atg gaa gtt tct gaa cag cct tct gag agt tct tct cag gac agt act 709
Met Glu Val Ser Glu Gln Pro Ser Glu Ser Ser Ser Gln Asp Ser Thr
80 85 90

aaa act gat ggc aag ata gct gtg tca gct tct cct gct gtt cct agg 757
Lys Thr Asp Gly Lys Ile Ala Val Ser Ala Ser Pro Ala Val Pro Arg
95 100 105 110

aag aag cct gtt ggt gtt agg caa agg aaa tgg ggg aaa tgg gct gct 805
Lys Lys Pro Val Gly Val Arg Gln Arg Lys Trp Gly Lys Trp Ala Ala
115 120 125

gag att aga gat cct att aag aaa act agg act tgg ttg ggt act ttt 853
Glu Ile Arg Asp Pro Ile Lys Lys Thr Arg Thr Trp Leu Gly Thr Phe
130 135 140

gat act ctt gaa gaa gct gct aaa gct tat gat gct aag aag ctt gag 901
Asp Thr Leu Glu Glu Ala Ala Lys Ala Tyr Asp Ala Lys Lys Leu Glu
145 150 155

ttt gat gct att gtt gct gga aat gtg tcc act act aaa cgt gat gtt 949
Phe Asp Ala Ile Val Ala Gly Asn Val Ser Thr Thr Lys Arg Asp Val
160 165 170



tct tca tct gag act agc caa tgc tct cgt tct tca cct gtt gtt cct 997
Ser Ser Ser Glu Thr Ser Gln Cys Ser Arg Ser Ser Pro Val Val Pro
175 180 185 190

gtt gag caa gat gac act tct gca tca gct ctc act tgt gtc aac aac 1045
Val Glu Gln Asp Asp Thr Ser Ala Ser Ala Leu Thr Cys Val Asn Asn
195 200 205

cct gat gac gtc tcg acc gtt gct cca act gct cca act cca aat gtt 1093
Pro Asp Asp Val Ser Thr Val Ala Pro Thr Ala Pro Thr Pro Asn Val
210 215 220

cct gct ggt gga aac aag gaa acg ttg ttc gat ttc gac ttt act aat 1141
Pro Ala Gly Gly Asn Lys Glu Thr Leu Phe Asp Phe Asp Phe Thr Asn
225 230 235

cta cag atc cct gat ttt ggt ttc ttg gca gag gag caa caa gac cta 1189
Leu Gln Ile Pro Asp Phe Gly Phe Leu Ala Glu Glu Gln Gln Asp Leu
240 245 250

gac ttc gat tgt ttc ctc gcg gat gat cag ttt gat gat ttc ggc ttg 1237
Asp Phe Asp Cys Phe Leu Ala Asp Asp Gln Phe Asp Asp Phe Gly Leu
255 260 265 270

ctt gat gac att caa gga ttc gaa gat aac ggt cca agt gcg tta cca 1285
Leu Asp Asp Ile Gln Gly Phe Glu Asp Asn Gly Pro Ser Ala Leu Pro
275 280 285

gat ttc gac ttt gcg gat gtt gaa gat ctt cag cta gct gac tct agt 1333
Asp Phe Asp Phe Ala Asp Val Glu Asp Leu Gln Leu Ala Asp Ser Ser
290 295 300

ttc ggt ttc ctt gat caa ctt gct cct atc aac atc tct tgc cca tta 1381
Phe Gly Phe Leu Asp Gln Leu Ala Pro Ile Asn Ile Ser Cys Pro Leu
305 310 315

aaa agt ttt gca gct tca tag gatcttgctt agtaatgtta agtgagaaga 1432
Lys Ser Phe Ala Ala Ser
320

gtgttttgtt ttttcgttta tgctttagta atttaagaca tacaaaagtg tgtgttccgg 1492 

attgtagtaa gatcttaaga cataaagccg ggttttgcaa ttaggaatcg agttttaatg 1552

aagttttagt ttatgtttg 1571 



<210> 18 
<211> 324 
<212> PRT
<213> Arabidopsis thaliana
<400> 18

Met Val Ala Ile Arg Lys Glu Gln Ser Leu Ser Gly Val Ser Ser Glu
1 5 10 15

Ile Lys Lys Arg Ala Lys Arg Asn Thr Leu Ser Ser Leu Pro Gln Glu
20 25 30

Thr Gln Pro Leu Arg Lys Val Arg Ile Ile Val Asn Asp Pro Tyr Ala
35 40 45

Thr Asp Asp Ser Ser Ser Asp Glu Glu Glu Leu Lys Val Pro Lys Pro
50 55 60

Arg Lys Met Lys Arg Ile Val Arg Glu Ile Asn Phe Pro Ser Met Glu
65 70 75 80

Val Ser Glu Gln Pro Ser Glu Ser Ser Ser Gln Asp Ser Thr Lys Thr
85 90 95

Asp Gly Lys Ile Ala Val Ser Ala Ser Pro Ala Val Pro Arg Lys Lys
100 105 110

Pro Val Gly Val Arg Gln Arg Lys Trp Gly Lys Trp Ala Ala Glu Ile
115 120 125

Arg Asp Pro Ile Lys Lys Thr Arg Thr Trp Leu Gly Thr Phe Asp Thr
130 135 140

Leu Glu Glu Ala Ala Lys Ala Tyr Asp Ala Lys Lys Leu Glu Phe Asp
145 150 155 160

Ala Ile Val Ala Gly Asn Val Ser Thr Thr Lys Arg Asp Val Ser Ser
165 170 175

Ser Glu Thr Ser Gln Cys Ser Arg Ser Ser Pro Val Val Pro Val Glu
180 185 190

Gln Asp Asp Thr Ser Ala Ser Ala Leu Thr Cys Val Asn Asn Pro Asp
195 200 205

Asp Val Ser Thr Val Ala Pro Thr Ala Pro Thr Pro Asn Val Pro Ala
210 215 220

Gly Gly Asn Lys Glu Thr Leu Phe Asp Phe Asp Phe Thr Asn Leu Gln
225 230 235 240

Ile Pro Asp Phe Gly Phe Leu Ala Glu Glu Gln Gln Asp Leu Asp Phe
245 250 255

Asp Cys Phe Leu Ala Asp Asp Gln Phe Asp Asp Phe Gly Leu Leu Asp
260 265 270

Asp Ile Gln Gly Phe Glu Asp Asn Gly Pro Ser Ala Leu Pro Asp Phe
275 280 285

Asp Phe Ala Asp Val Glu Asp Leu Gln Leu Ala Asp Ser Ser Phe Gly
290 295 300

Phe Leu Asp Gln Leu Ala Pro Ile Asn Ile Ser Cys Pro Leu Lys Ser
305 310 315 320

Phe Ala Ala Ser



<210> 19
<211> 1322
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (104)..(1084)
<223> G1133
<400> 19



ttcaagaaag aatcaccaag tgttgcgttc cacacatttg agcaacagct tccacaatcg 60 

tattgtattc ctgtaaagtt cccttggctt aaactgcaag agc atg cct ctt gat 115
Met Pro Leu Asp
1

acc aaa cag cag aaa tgg ttg cca tta ggc tta aat cct caa gct tgt 163
Thr Lys Gln Gln Lys Trp Leu Pro Leu Gly Leu Asn Pro Gln Ala Cys
5 10 15 20



gtc cag gac aag gcg act gag tat ttc cgt cct gga att cct ttt ccg 211
Val Gln Asp Lys Ala Thr Glu Tyr Phe Arg Pro Gly Ile Pro Phe Pro
25 30 35

gaa ctc ggt aaa gtt tat gca gct gag cat cag ttt cgc tat ttg cag 259
Glu Leu Gly Lys Val Tyr Ala Ala Glu His Gln Phe Arg Tyr Leu Gln
40 45 50

cca ccg ttc caa gcc tta ttg tct aga tat gat cag cag tct tgt gga 307
Pro Pro Phe Gln Ala Leu Leu Ser Arg Tyr Asp Gln Gln Ser Cys Gly
55 60 65

aaa caa gtt tca tgt ttg aat ggg cga tct agc aac ggt gct gct cca 355
Lys Gln Val Ser Cys Leu Asn Gly Arg Ser Ser Asn Gly Ala Ala Pro
70 75 80

gag ggg gca ctc aag tct tct cgg aaa aga ttt ata gta ttc gat cag 403
Glu Gly Ala Leu Lys Ser Ser Arg Lys Arg Phe Ile Val Phe Asp Gln
85 90 95 100

tcg gga gag cag act cgt ttg tta caa tgt gga ttt cct ctg cgg ttt 451
Ser Gly Glu Gln Thr Arg Leu Leu Gln Cys Gly Phe Pro Leu Arg Phe
105 110 115

cct tct tct atg gat gca gag cga ggg aac att ctc ggt gcc cta cac 499
Pro Ser Ser Met Asp Ala Glu Arg Gly Asn Ile Leu Gly Ala Leu His
120 125 130

cca gag aaa ggg ttt agt aaa gat cat gcc att caa gaa aag ata ttg 547
Pro Glu Lys Gly Phe Ser Lys Asp His Ala Ile Gln Glu Lys Ile Leu
135 140 145

caa cat gaa gat cat gaa aat ggc gaa gaa gac tcg gaa atg cac gaa 595
Gln His Glu Asp His Glu Asn Gly Glu Glu Asp Ser Glu Met His Glu
150 155 160

gac act gag gaa atc aac gcg tta ctg tat tct gat gat gac gat aat 643
Asp Thr Glu Glu Ile Asn Ala Leu Leu Tyr Ser Asp Asp Asp Asp Asn
165 170 175 180

gat gat tgg gaa agt gat gat gaa gta atg agc act ggt cac tct cca 691
Asp Asp Trp Glu Ser Asp Asp Glu Val Met Ser Thr Gly His Ser Pro
185 190 195

ttc aca gtt gaa caa caa gcg tgc aac ata aca aca gaa gag ctg gat 739
Phe Thr Val Glu Gln Gln Ala Cys Asn Ile Thr Thr Glu Glu Leu Asp
200 205 210

gaa act gaa agc act gtt gat ggt cca ctt ctt aaa aga cag aaa cta 787
Glu Thr Glu Ser Thr Val Asp Gly Pro Leu Leu Lys Arg Gln Lys Leu
215 220 225

ctg gac cat tcg tac aga gac tca tca cca tcc ctt gtg ggc acc act 835
Leu Asp His Ser Tyr Arg Asp Ser Ser Pro Ser Leu Val Gly Thr Thr
230 235 240

aaa gtc aaa ggc tta tca gat gaa aac ctt cct gaa tca aac att tca 883
Lys Val Lys Gly Leu Ser Asp Glu Asn Leu Pro Glu Ser Asn Ile Ser
245 250 255 260

agc aaa caa gaa acg ggt tct ggt ttg agc gac gag cag tca aga aaa 931
Ser Lys Gln Glu Thr Gly Ser Gly Leu Ser Asp Glu Gln Ser Arg Lys
265 270 275

gac aag att cac acc gct ctg aga atc ctg gag agt gta gtt cca ggg 979
Asp Lys Ile His Thr Ala Leu Arg Ile Leu Glu Ser Val Val Pro Gly
280 285 290

gca aag gga aaa gaa gct ctt tta cta cta gac gaa gcc att gat tac 1027
Ala Lys Gly Lys Glu Ala Leu Leu Leu Leu Asp Glu Ala Ile Asp Tyr
295 300 305

ctc aag ttg ctg aag caa agc tta aac tca tca aag ggt ttg aat aac 1075
Leu Lys Leu Leu Lys Gln Ser Leu Asn Ser Ser Lys Gly Leu Asn Asn
310 315 320

cat tgg tga aaaacctaca accccttttg tcctattgat aaggcatgtt 1124
His Trp
325

tggttggtta aagagaagac atgggacaaa agataatcaa tgaggtaaag gactgatgaa 1184 

gaagattctc tcaaattcat taacgtgggt ttgaaacaat tagaacacgc ctggtgaccc 1244

tagtgggacc gtatccactg ttcatctagc tggatcaata gtggtttact tttggatttg 1304 

gcatgctctc tcaaaaaa 1322 

<210> 20 
<211> 326 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 20 

Met Pro Leu Asp Thr Lys Gln Gln Lys Trp Leu Pro Leu Gly Leu Asn
1 5 10 15

Pro Gln Ala Cys Val Gln Asp Lys Ala Thr Glu Tyr Phe Arg Pro Gly
20 25 30

Ile Pro Phe Pro Glu Leu Gly Lys Val Tyr Ala Ala Glu His Gln Phe
35 40 45

Arg Tyr Leu Gln Pro Pro Phe Gln Ala Leu Leu Ser Arg Tyr Asp Gln
50 55 60

Gln Ser Cys Gly Lys Gln Val Ser Cys Leu Asn Gly Arg Ser Ser Asn
65 70 75 80

Gly Ala Ala Pro Glu Gly Ala Leu Lys Ser Ser Arg Lys Arg Phe Ile
85 90 95

Val Phe Asp Gln Ser Gly Glu Gln Thr Arg Leu Leu Gln Cys Gly Phe
100 105 110

Pro Leu Arg Phe Pro Ser Ser Met Asp Ala Glu Arg Gly Asn Ile Leu
115 120 125

Gly Ala Leu His Pro Glu Lys Gly Phe Ser Lys Asp His Ala Ile Gln
130 135 140

Glu Lys Ile Leu Gln His Glu Asp His Glu Asn Gly Glu Glu Asp Ser
145 150 155 160

Glu Met His Glu Asp Thr Glu Glu Ile Asn Ala Leu Leu Tyr Ser Asp
165 170 175

Asp Asp Asp Asn Asp Asp Trp Glu Ser Asp Asp Glu Val Met Ser Thr
180 185 190

Gly His Ser Pro Phe Thr Val Glu Gln Gln Ala Cys Asn Ile Thr Thr
195 200 205

Glu Glu Leu Asp Glu Thr Glu Ser Thr Val Asp Gly Pro Leu Leu Lys
210 215 220

Arg Gln Lys Leu Leu Asp His Ser Tyr Arg Asp Ser Ser Pro Ser Leu
225 230 235 240

Val Gly Thr Thr Lys Val Lys Gly Leu Ser Asp Glu Asn Leu Pro Glu
245 250 255

Ser Asn Ile Ser Ser Lys Gln Glu Thr Gly Ser Gly Leu Ser Asp Glu
260 265 270

Gln Ser Arg Lys Asp Lys Ile His Thr Ala Leu Arg Ile Leu Glu Ser
275 280 285

Val Val Pro Gly Ala Lys Gly Lys Glu Ala Leu Leu Leu Leu Asp Glu
290 295 300

Ala Ile Asp Tyr Leu Lys Leu Leu Lys Gln Ser Leu Asn Ser Ser Lys
305 310 315 320

Gly Leu Asn Asn His Trp
325



<210> 21
<211> 859
<212> DNA
<213> Arabidopsis
<220>
<221> CDS
<222> (62)..(718)
<223> G1266
<400> 21

caatccacta acgatcccta accgaaaaca gagtagtcaa gaaacagagt attttttcta 60 

c atg gat cca ttt tta att cag tcc cca ttc tcc ggc ttc tca ccg gaa 109
Met Asp Pro Phe Leu Ile Gln Ser Pro Phe Ser Gly Phe Ser Pro Glu
1 5 10 15

tat tct atc gga tct tct cca gat tct ttc tca tcc tct tct tct aac 157
Tyr Ser Ile Gly Ser Ser Pro Asp Ser Phe Ser Ser Ser Ser Ser Asn
20 25 30

aat tac tct ctt ccc ttc aac gag aac gac tca gag gaa atg ttt ctc 205
Asn Tyr Ser Leu Pro Phe Asn Glu Asn Asp Ser Glu Glu Met Phe Leu
35 40 45

tac ggt cta atc gag cag tcc acg caa caa acc tat att gac tcg gat 253
Tyr Gly Leu Ile Glu Gln Ser Thr Gln Gln Thr Tyr Ile Asp Ser Asp
50 55 60

agt caa gac ctt ccg atc aaa tcc gta agc tca aga aag tca gag aag 301
Ser Gln Asp Leu Pro Ile Lys Ser Val Ser Ser Arg Lys Ser Glu Lys
65 70 75 80

tct tac aga ggc gta aga cga cgg cca tgg ggg aaa ttc gcg gcg gag 349
Ser Tyr Arg Gly Val Arg Arg Arg Pro Trp Gly Lys Phe Ala Ala Glu
85 90 95

ata aga gat tcg act aga aac ggt att agg gtt tgg ctc ggg acg ttc 397
Ile Arg Asp Ser Thr Arg Asn Gly Ile Arg Val Trp Leu Gly Thr Phe
100 105 110

gaa agc gcg gaa gag gcg gct tta gcc tac gat caa gct gct ttc tcg 445
Glu Ser Ala Glu Glu Ala Ala Leu Ala Tyr Asp Gln Ala Ala Phe Ser
115 120 125

atg aga ggg tcc tcg gcg att ctc aat ttt tcg gcg gag aga gtt caa 493
Met Arg Gly Ser Ser Ala Ile Leu Asn Phe Ser Ala Glu Arg Val Gln
130 135 140

gag tcg ctt tcg gag att aaa tat acc tac gag gat ggt tgt tct ccg 541
Glu Ser Leu Ser Glu Ile Lys Tyr Thr Tyr Glu Asp Gly Cys Ser Pro
145 150 155 160

gtt gtg gcg ttg aag agg aaa cac tcg atg aga cgg aga atg acc aat 589
Val Val Ala Leu Lys Arg Lys His Ser Met Arg Arg Arg Met Thr Asn
165 170 175

aag aag acg aaa gat agt gac ttt gat cac cgc tcc gtg aag tta gat 637
Lys Lys Thr Lys Asp Ser Asp Phe Asp His Arg Ser Val Lys Leu Asp
180 185 190

aat gta gtt gtc ttt gag gat ttg gga gaa cag tac ctt gag gag ctt 685
Asn Val Val Val Phe Glu Asp Leu Gly Glu Gln Tyr Leu Glu Glu Leu
195 200 205

ttg ggg tct tct gaa aat agt ggg act tgg tga aagattagga tttgtattag 738
Leu Gly Ser Ser Glu Asn Ser Gly Thr Trp
210 215

ggaccttaag tttgaagtgg ttgattaatt ttaaccctaa tatgtttttt gtttgcttaa 798 

atatttgatt ctattgagaa acatcgaaaa cagtttgtat gtacttttgt gatacttggc 858 

g 859 

<210> 22 
<211> 218 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 22 

Met Asp Pro Phe Leu Ile Gln Ser Pro Phe Ser Gly Phe Ser Pro Glu
1 5 10 15

Tyr Ser Ile Gly Ser Ser Pro Asp Ser Phe Ser Ser Ser Ser Ser Asn
20 25 30

Asn Tyr Ser Leu Pro Phe Asn Glu Asn Asp Ser Glu Glu Met Phe Leu
35 40 45

Tyr Gly Leu Ile Glu Gln Ser Thr Gln Gln Thr Tyr Ile Asp Ser Asp
50 55 60

Ser Gln Asp Leu Pro Ile Lys Ser Val Ser Ser Arg Lys Ser Glu Lys
65 70 75 80

Ser Tyr Arg Gly Val Arg Arg Arg Pro Trp Gly Lys Phe Ala Ala Glu
85 90 95

Ile Arg Asp Ser Thr Arg Asn Gly Ile Arg Val Trp Leu Gly Thr Phe
100 105 110

Glu Ser Ala Glu Glu Ala Ala Leu Ala Tyr Asp Gln Ala Ala Phe Ser
115 120 125

Met Arg Gly Ser Ser Ala Ile Leu Asn Phe Ser Ala Glu Arg Val Gln
130 135 140

Glu Ser Leu Ser Glu Ile Lys Tyr Thr Tyr Glu Asp Gly Cys Ser Pro
145 150 155 160

Val Val Ala Leu Lys Arg Lys His Ser Met Arg Arg Arg Met Thr Asn
165 170 175

Lys Lys Thr Lys Asp Ser Asp Phe Asp His Arg Ser Val Lys Leu Asp
180 185 190

Asn Val Val Val Phe Glu Asp Leu Gly Glu Gln Tyr Leu Glu Glu Leu
195 200 205

Leu Gly Ser Ser Glu Asn Ser Gly Thr Trp 
210 215 



<210> 23
<211> 1137
<212> DNA
<213> Arabidopsis
<220>
<221> CDS
<222> (54)..(914)
<223> G1324
<400> 23



cgaaaacacc acaaaccaaa tatcattaag taattaggaa acttaaacta agt atg 56 

Met 
1 

gaa aat tcg atg aag aag aag aag agc ttc aaa gaa agt gaa gat gaa 104
Glu Asn Ser Met Lys Lys Lys Lys Ser Phe Lys Glu Ser Glu Asp Glu
5 10 15

gaa cta aga aga ggg cct tgg act ttg gag gaa gac aca ctt ctc aca 152
Glu Leu Arg Arg Gly Pro Trp Thr Leu Glu Glu Asp Thr Leu Leu Thr
20 25 30

aat tac atc ctc cat aac ggt gag ggt cgt tgg aat cac gtc gcc aaa 200
Asn Tyr Ile Leu His Asn Gly Glu Gly Arg Trp Asn His Val Ala Lys
35 40 45

tgt gct ggg cta aag aga act ggg aaa agt tgt aga ttg aga tgg ttg 248
Cys Ala Gly Leu Lys Arg Thr Gly Lys Ser Cys Arg Leu Arg Trp Leu
50 55 60 65

aat tac ttg aaa ccc gac ata aga cga ggg aat ctt act cct caa gaa 296
Asn Tyr Leu Lys Pro Asp Ile Arg Arg Gly Asn Leu Thr Pro Gln Glu
70 75 80

cag ctt ttg atc ctt gag ctt cac tct aaa tgg ggt aat agg tgg tcc 344
Gln Leu Leu Ile Leu Glu Leu His Ser Lys Trp Gly Asn Arg Trp Ser
85 90 95

aag att gca cag tac ttg cca gga aga acg gat aac gag atc aag aac 392
Lys Ile Ala Gln Tyr Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys Asn
100 105 110

tat tgg aga aca aga gtt caa aaa caa gct cgt caa ctc aac atc gaa 440
Tyr Trp Arg Thr Arg Val Gln Lys Gln Ala Arg Gln Leu Asn Ile Glu
115 120 125

tct aac agc gac aag ttc ttt gac gct gtt cgt agt ttt tgg gtc cct 488
Ser Asn Ser Asp Lys Phe Phe Asp Ala Val Arg Ser Phe Trp Val Pro
130 135 140 145

aga ttg atc gag aag atg gaa caa aac tca tcc act act act act tat 536
Arg Leu Ile Glu Lys Met Glu Gln Asn Ser Ser Thr Thr Thr Thr Tyr
150 155 160

tgt tgt ccc caa aac aac aac aac aac tct ctt ctt ctt cct tct caa 584
Cys Cys Pro Gln Asn Asn Asn Asn Asn Ser Leu Leu Leu Pro Ser Gln
165 170 175

tct cac gac tct tta agt atg caa aaa gat ata gat tac tcg ggt ttc 632
Ser His Asp Ser Leu Ser Met Gln Lys Asp Ile Asp Tyr Ser Gly Phe
180 185 190

agc aac ata gac ggt tct tct tca act tct act tgc atg tct cat cta 680
Ser Asn Ile Asp Gly Ser Ser Ser Thr Ser Thr Cys Met Ser His Leu
195 200 205

aca aca gtt cca cac ttt atg gat caa agc aac acc aat atc atc gat 728
Thr Thr Val Pro His Phe Met Asp Gln Ser Asn Thr Asn Ile Ile Asp
210 215 220 225

ggc tcg atg tgt ttc cat gaa ggc aat gtt caa gaa ttc gga gga tat 776
Gly Ser Met Cys Phe His Glu Gly Asn Val Gln Glu Phe Gly Gly Tyr
230 235 240

gtt cct ggc atg gag gat tac atg gta aac tcg gac atc tca atg gaa 824
Val Pro Gly Met Glu Asp Tyr Met Val Asn Ser Asp Ile Ser Met Glu
245 250 255

tgt cac gtg gcg gat ggt tat tca gcg tac gag gat gtt aca caa gat 872
Cys His Val Ala Asp Gly Tyr Ser Ala Tyr Glu Asp Val Thr Gln Asp
260 265 270

ccc atg tgg aat gtg gat gac att tgg cag ttt agg gag taa 914
Pro Met Trp Asn Val Asp Asp Ile Trp Gln Phe Arg Glu
275 280 285

ttaagtcgtc aagagatgag atggtagagc ctaccactac ggttctatta tatggactaa 974 

tatacttctt ttgcttaact aagcaaaaag tttcgaacct tttacccata ttatctcggg 1034

ttggagacta gaacatgtta aatttgtatc ttctttgttg cgagtactta ctaagtcatt 1094 

ggataaatat ttataatgat agtttcttgt acaaaaaaaa aaa 1137

<210> 24
<211> 286
<212> PRT
<213> Arabidopsis thaliana
<400> 24

Met Glu Asn Ser Met Lys Lys Lys Lys Ser Phe Lys Glu Ser Glu Asp
1 5 10 15

Glu Glu Leu Arg Arg Gly Pro Trp Thr Leu Glu Glu Asp Thr Leu Leu
20 25 30

Thr Asn Tyr Ile Leu His Asn Gly Glu Gly Arg Trp Asn His Val Ala
35 40 45

Lys Cys Ala Gly Leu Lys Arg Thr Gly Lys Ser Cys Arg Leu Arg Trp
50 55 60

Leu Asn Tyr Leu Lys Pro Asp Ile Arg Arg Gly Asn Leu Thr Pro Gln
65 70 75 80

Glu Gln Leu Leu Ile Leu Glu Leu His Ser Lys Trp Gly Asn Arg Trp
85 90 95

Ser Lys Ile Ala Gln Tyr Leu Pro Gly Arg Thr Asp Asn Glu Ile Lys
100 105 110

Asn Tyr Trp Arg Thr Arg Val Gln Lys Gln Ala Arg Gln Leu Asn Ile
115 120 125

Glu Ser Asn Ser Asp Lys Phe Phe Asp Ala Val Arg Ser Phe Trp Val
130 135 140

Pro Arg Leu Ile Glu Lys Met Glu Gln Asn Ser Ser Thr Thr Thr Thr
145 150 155 160

Tyr Cys Cys Pro Gln Asn Asn Asn Asn Asn Ser Leu Leu Leu Pro Ser
165 170 175

Gln Ser His Asp Ser Leu Ser Met Gln Lys Asp Ile Asp Tyr Ser Gly
180 185 190

Phe Ser Asn Ile Asp Gly Ser Ser Ser Thr Ser Thr Cys Met Ser His
195 200 205

Leu Thr Thr Val Pro His Phe Met Asp Gln Ser Asn Thr Asn Ile Ile
210 215 220

Asp Gly Ser Met Cys Phe His Glu Gly Asn Val Gln Glu Phe Gly Gly
225 230 235 240

Tyr Val Pro Gly Met Glu Asp Tyr Met Val Asn Ser Asp Ile Ser Met
245 250 255

Glu Cys His Val Ala Asp Gly Tyr Ser Ala Tyr Glu Asp Val Thr Gln
260 265 270

Asp Pro Met Trp Asn Val Asp Asp Ile Trp Gln Phe Arg Glu
275 280 285


<210> 25
<211> 1630
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (97)..(1398)
<223> G1337
<400> 25



aatggatttg tcatcattct tctcaccgtc cttagtctct gaaaataaat tctgattttg 60 

atttcgaatt ttagggattt tgagagagag tcagtt atg agt agt tcg gag aga 114
Met Ser Ser Ser Glu Arg
1 5

gta ccg tgc gat ttc tgc ggc gag cgt acg gcg gtt ttg ttt tgt aga 162
Val Pro Cys Asp Phe Cys Gly Glu Arg Thr Ala Val Leu Phe Cys Arg
10 15 20

gcc gat acg gcg aag ctg tgt ttg cct tgt gat cag caa gtt cac acg 210
Ala Asp Thr Ala Lys Leu Cys Leu Pro Cys Asp Gln Gln Val His Thr
25 30 35

gcg aat ctg ttg tcg agg aag cac gtg cga tct cag atc tgc gat aat 258
Ala Asn Leu Leu Ser Arg Lys His Val Arg Ser Gln Ile Cys Asp Asn
40 45 50

tgc ggt aac gag cca gtc tct gtt cgg tgt ttc acc gat aat ctg att 306
Cys Gly Asn Glu Pro Val Ser Val Arg Cys Phe Thr Asp Asn Leu Ile
55 60 65 70

ttg tgt cag gag tgt gat tgg gat gtt cac gga agt tgt tca gtt tcc 354
Leu Cys Gln Glu Cys Asp Trp Asp Val His Gly Ser Cys Ser Val Ser
75 80 85

gat gct cat gtt cga tcc gcc gtg gaa ggt ttt tcc ggt tgt cca tcg 402
Asp Ala His Val Arg Ser Ala Val Glu Gly Phe Ser Gly Cys Pro Ser
90 95 100

gcg ttg gag ctt gct gct tta tgg gga ctt gat ttg gag caa ggg agg 450
Ala Leu Glu Leu Ala Ala Leu Trp Gly Leu Asp Leu Glu Gln Gly Arg
105 110 115

aaa gat gaa gag aat caa gtt ccg atg atg gcg atg atg atg gat aat 498
Lys Asp Glu Glu Asn Gln Val Pro Met Met Ala Met Met Met Asp Asn
120 125 130

ttc ggg atg cag ttg gat tct tgg gtt ttg gga tct aat gaa ttg att 546
Phe Gly Met Gln Leu Asp Ser Trp Val Leu Gly Ser Asn Glu Leu Ile
135 140 145 150

gtt ccc agc gat acg acg ttt aag aag cgt gga tct tgt gga tct agt 594
Val Pro Ser Asp Thr Thr Phe Lys Lys Arg Gly Ser Cys Gly Ser Ser
155 160 165

tgt ggg agg tat aag cag gta ttg tgt aag cag ctt gag gag ttg ctt 642
Cys Gly Arg Tyr Lys Gln Val Leu Cys Lys Gln Leu Glu Glu Leu Leu
170 175 180

aag agt ggt gtt gtc ggt ggt gat ggc gat gat ggt gat cgt gac cgt 690
Lys Ser Gly Val Val Gly Gly Asp Gly Asp Asp Gly Asp Arg Asp Arg
185 190 195

gat tgt gac cgt gag ggt gct tgt gat gga gat gga gat gga gaa gca 738
Asp Cys Asp Arg Glu Gly Ala Cys Asp Gly Asp Gly Asp Gly Glu Ala
200 205 210

gga gag ggg ctt atg gtt ccg gag atg tca gag aga ttg aaa tgg tca 786
Gly Glu Gly Leu Met Val Pro Glu Met Ser Glu Arg Leu Lys Trp Ser
215 220 225 230

aga gat gtt gag gag atc aat ggt ggc gga gga gga gga gtt aac cag 834
Arg Asp Val Glu Glu Ile Asn Gly Gly Gly Gly Gly Gly Val Asn Gln
235 240 245

cag tgg aat gct act act act aat cct agt ggt ggc cag agt tct cag 882
Gln Trp Asn Ala Thr Thr Thr Asn Pro Ser Gly Gly Gln Ser Ser Gln
250 255 260

ata tgg gat ttt aac ttg gga cag tca cgg gga cct gag gat acg agt 930
Ile Trp Asp Phe Asn Leu Gly Gln Ser Arg Gly Pro Glu Asp Thr Ser
265 270 275

cga gtg gaa gct gca tat gta ggg aaa ggt gct gct tct tca ttc aca 978
Arg Val Glu Ala Ala Tyr Val Gly Lys Gly Ala Ala Ser Ser Phe Thr
280 285 290

atc aac aat ttt gtt gac cat atg aat gaa act tgt tcc act aat gtg 1026
Ile Asn Asn Phe Val Asp His Met Asn Glu Thr Cys Ser Thr Asn Val
295 300 305 310

aaa ggt gtc aaa gag att aaa aag gat gac tac aag cga tca act tca 1074
Lys Gly Val Lys Glu Ile Lys Lys Asp Asp Tyr Lys Arg Ser Thr Ser
315 320 325

ggc cag gta caa cca aca aaa tct gag agc aac aat cgt cca att acc 1122
Gly Gln Val Gln Pro Thr Lys Ser Glu Ser Asn Asn Arg Pro Ile Thr
330 335 340

ttt ggc tct gag aaa ggt tcg aac tcc tcc agt gac ttg cat ttc aca 1170
Phe Gly Ser Glu Lys Gly Ser Asn Ser Ser Ser Asp Leu His Phe Thr
345 350 355

gag cat att gct gga act agt tgt aag acc aca aga cta gtt gca act 1218
Glu His Ile Ala Gly Thr Ser Cys Lys Thr Thr Arg Leu Val Ala Thr
360 365 370

aag gct gat ctg gag cgg ctg gct cag aac aga gga gat gca atg cag 1266
Lys Ala Asp Leu Glu Arg Leu Ala Gln Asn Arg Gly Asp Ala Met Gln
375 380 385 390

cgt tac aag gaa aag agg aag aca cgg aga tat gat aag acc ata agg 1314
Arg Tyr Lys Glu Lys Arg Lys Thr Arg Arg Tyr Asp Lys Thr Ile Arg
395 400 405

tat gaa tcg agg aag gca aga gct gac act agg ttg cgt gtc aga ggc 1362
Tyr Glu Ser Arg Lys Ala Arg Ala Asp Thr Arg Leu Arg Val Arg Gly
410 415 420

aga ttt gtg aaa gct agt gaa gct cct tac cct taa ccttaagttt 1408
Arg Phe Val Lys Ala Ser Glu Ala Pro Tyr Pro
425 430

tttcacatag gcttcctttt agctacaaac ttagttactt tttttactcc actgcctcat 1468 

aaatgtacag accggtctcg tttcatctgg ccgcccttct tgttttattg ccttatctgg 1528 

cccttttatg taccttggaa tcttatctag tttaaaaaag attgtaacct tctagaaaac 1588 

catattctgt tgacagtata tacatgtcta tccaagcaaa aa 1630 

<210> 26
<211> 433
<212> PRT
<213> Arabidopsis thaliana
<400> 26

Met Ser Ser Ser Glu Arg Val Pro Cys Asp Phe Cys Gly Glu Arg Thr
1 5 10 15

Ala Val Leu Phe Cys Arg Ala Asp Thr Ala Lys Leu Cys Leu Pro Cys
20 25 30

Asp Gln Gln Val His Thr Ala Asn Leu Leu Ser Arg Lys His Val Arg
35 40 45

Ser Gln Ile Cys Asp Asn Cys Gly Asn Glu Pro Val Ser Val Arg Cys
50 55 60

Phe Thr Asp Asn Leu Ile Leu Cys Gln Glu Cys Asp Trp Asp Val His
65 70 75 80

Gly Ser Cys Ser Val Ser Asp Ala His Val Arg Ser Ala Val Glu Gly
85 90 95

Phe Ser Gly Cys Pro Ser Ala Leu Glu Leu Ala Ala Leu Trp Gly Leu
100 105 110

Asp Leu Glu Gln Gly Arg Lys Asp Glu Glu Asn Gln Val Pro Met Met
115 120 125

Ala Met Met Met Asp Asn Phe Gly Met Gln Leu Asp Ser Trp Val Leu
130 135 140

Gly Ser Asn Glu Leu Ile Val Pro Ser Asp Thr Thr Phe Lys Lys Arg
145 150 155 160

Gly Ser Cys Gly Ser Ser Cys Gly Arg Tyr Lys Gln Val Leu Cys Lys
165 170 175

Gln Leu Glu Glu Leu Leu Lys Ser Gly Val Val Gly Gly Asp Gly Asp
180 185 190



Asp Gly Asp Arg Asp Arg Asp Cys Asp Arg Glu Gly Ala Cys Asp Gly
195 200 205

Asp Gly Asp Gly Glu Ala Gly Glu Gly Leu Met Val Pro Glu Met Ser
210 215 220

Glu Arg Leu Lys Trp Ser Arg Asp Val Glu Glu Ile Asn Gly Gly Gly
225 230 235 240

Gly Gly Gly Val Asn Gln Gln Trp Asn Ala Thr Thr Thr Asn Pro Ser
245 250 255

Gly Gly Gln Ser Ser Gln Ile Trp Asp Phe Asn Leu Gly Gln Ser Arg
260 265 270

Gly Pro Glu Asp Thr Ser Arg Val Glu Ala Ala Tyr Val Gly Lys Gly
275 280 285

Ala Ala Ser Ser Phe Thr Ile Asn Asn Phe Val Asp His Met Asn Glu
290 295 300

Thr Cys Ser Thr Asn Val Lys Gly Val Lys Glu Ile Lys Lys Asp Asp
305 310 315 320

Tyr Lys Arg Ser Thr Ser Gly Gln Val Gln Pro Thr Lys Ser Glu Ser
325 330 335

Asn Asn Arg Pro Ile Thr Phe Gly Ser Glu Lys Gly Ser Asn Ser Ser
340 345 350

Ser Asp Leu His Phe Thr Glu His Ile Ala Gly Thr Ser Cys Lys Thr
355 360 365

Thr Arg Leu Val Ala Thr Lys Ala Asp Leu Glu Arg Leu Ala Gln Asn
370 375 380

Arg Gly Asp Ala Met Gln Arg Tyr Lys Glu Lys Arg Lys Thr Arg Arg
385 390 395 400

Tyr Asp Lys Thr Ile Arg Tyr Glu Ser Arg Lys Ala Arg Ala Asp Thr
405 410 415

Arg Leu Arg Val Arg Gly Arg Phe Val Lys Ala Ser Glu Ala Pro Tyr
420 425 430

Pro



<210> 27 

<211> 768 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (58)..(657)

<223> G975 

<400> 27 

attactcatc atcaagttcc tactttctct ctgacaaaca tcacagagta agtaaga 57 

atg gta cag acg aag aag ttc aga ggt gtc agg caa cgc cat tgg ggt 105
Met Val Gln Thr Lys Lys Phe Arg Gly Val Arg Gln Arg His Trp Gly
1 5 10 15

tct tgg gtc gct gag att cgt cat cct ctc ttg aaa cgg agg att tgg 153
Ser Trp Val Ala Glu Ile Arg His Pro Leu Leu Lys Arg Arg Ile Trp
20 25 30

cta ggg acg ttc gag acc gca gag gag gca gca aga gca tac gac gag 201
Leu Gly Thr Phe Glu Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Glu
35 40 45

gcc gcc gtt tta atg agc ggc cgc aac gcc aaa acc aac ttt ccc ctc 249
Ala Ala Val Leu Met Ser Gly Arg Asn Ala Lys Thr Asn Phe Pro Leu
50 55 60

aac aac aac aac acc gga gaa act tcc gag ggc aaa acc gat att tca 297
Asn Asn Asn Asn Thr Gly Glu Thr Ser Glu Gly Lys Thr Asp Ile Ser
65 70 75 80

gct tcg tcc aca atg tca tcc tca aca tca tct tca tcg ctc tct tcc 345
Ala Ser Ser Thr Met Ser Ser Ser Thr Ser Ser Ser Ser Leu Ser Ser
85 90 95

atc ctc agc gcc aaa ctg agg aaa tgc tgc aag tct cct tcc cca tcc 393
Ile Leu Ser Ala Lys Leu Arg Lys Cys Cys Lys Ser Pro Ser Pro Ser
100 105 110

ctc acc tgc ctc cgt ctt gac aca gcc agc tcc cat atc ggc gtc tgg 441
Leu Thr Cys Leu Arg Leu Asp Thr Ala Ser Ser His Ile Gly Val Trp
115 120 125

cag aaa cgg gcc ggt tca aag tct gac tcc agc tgg gtc atg acg gtg 489
Gln Lys Arg Ala Gly Ser Lys Ser Asp Ser Ser Trp Val Met Thr Val
130 135 140

gag cta ggt ccc gca agc tcc tcc caa gag act act agt aaa gct tca 537
Glu Leu Gly Pro Ala Ser Ser Ser Gln Glu Thr Thr Ser Lys Ala Ser
145 150 155 160

caa gac gct att ctt gct ccg acc act gaa gtt gaa att ggt ggc agc 585
Gln Asp Ala Ile Leu Ala Pro Thr Thr Glu Val Glu Ile Gly Gly Ser
165 170 175

aga gaa gaa gta ttg gat gag gaa gaa aag gtt gct ttg caa atg ata 633
Arg Glu Glu Val Leu Asp Glu Glu Glu Lys Val Ala Leu Gln Met Ile
180 185 190

gag gag ctt ctc aat aca aac taa atcttatttg cttatatata tgtacctatt 687
Glu Glu Leu Leu Asn Thr Asn
195

ttcattgctg atttacagee aaaataatca attataccgt gtattttata gatgttttat 747 
attaaaaggt tgttagatat a 768 

<210> 28 
<211> 199 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 28 

Met Val Gln Thr Lys Lys Phe Arg Gly Val Arg Gln Arg His Trp Gly
1 5 10 15

Ser Trp Val Ala Glu Ile Arg His Pro Leu Leu Lys Arg Arg Ile Trp
20 25 30

Leu Gly Thr Phe Glu Thr Ala Glu Glu Ala Ala Arg Ala Tyr Asp Glu
35 40 45

Ala Ala Val Leu Met Ser Gly Arg Asn Ala Lys Thr Asn Phe Pro Leu
50 55 60

Asn Asn Asn Asn Thr Gly Glu Thr Ser Glu Gly Lys Thr Asp Ile Ser
65 70 75 80

Ala Ser Ser Thr Met Ser Ser Ser Thr Ser Ser Ser Ser Leu Ser Ser
85 90 95

Ile Leu Ser Ala Lys Leu Arg Lys Cys Cys Lys Ser Pro Ser Pro Ser
100 105 110

Leu Thr Cys Leu Arg Leu Asp Thr Ala Ser Ser His Ile Gly Val Trp
115 120 125

Gln Lys Arg Ala Gly Ser Lys Ser Asp Ser Ser Trp Val Met Thr Val
130 135 140

Glu Leu Gly Pro Ala Ser Ser Ser Gln Glu Thr Thr Ser Lys Ala Ser
145 150 155 160

Gln Asp Ala Ile Leu Ala Pro Thr Thr Glu Val Glu Ile Gly Gly Ser
165 170 175

Arg Glu Glu Val Leu Asp Glu Glu Glu Lys Val Ala Leu Gln Met Ile
180 185 190

Glu Glu Leu Leu Asn Thr Asn
195



<210> 29
<211> 2526
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (338)..(2275)
<223> G680
<400> 29



cagttatctt cttccttctt ctctctgttt tttaaattta tttttagaga attttttttg 60 

ttttgcttcc gatttgatta tttccgggaa cgatgacttc tccggggagt tcccggtgag 120 

atgataagtc agattgcata cttgtctcct ccatggctac tctcaagggt tttggctgcg 180 

gtggattcgt ttggtttctc tagaatctaa agaggttatc acaacggctt tgcaatttga 240 

aaactttcat gtttggggag atcaaagatg gtttcttttt tatactttac ttgttagaga 300 

ggatttgaag cagcgaatag ctgcaaccgg tcctgtt atg gat act aat aca tct 355 

Met Asp Thr Asn Thr Ser 
1 5 

gga gaa gaa tta tta gct aag gca aga aag cca tat aca ata aca aag 403
Gly Glu Glu Leu Leu Ala Lys Ala Arg Lys Pro Tyr Thr Ile Thr Lys
10 15 20

cag cga gag cga tgg act gag gat gag cat gag agg ttt cta gaa gcc 451
Gln Arg Glu Arg Trp Thr Glu Asp Glu His Glu Arg Phe Leu Glu Ala
25 30 35

ttg agg ctt tat gga aga gct tgg caa cga att gaa gaa cat att ggg 499
Leu Arg Leu Tyr Gly Arg Ala Trp Gln Arg Ile Glu Glu His Ile Gly
40 45 50

aca aag act gct gtt cag atc aga agt cat gca caa aag ttc ttc aca 547
Thr Lys Thr Ala Val Gln Ile Arg Ser His Ala Gln Lys Phe Phe Thr
55 60 65 70

aag ttg gag aaa gag gct gaa gtt aaa ggc atc cct gtt tgc caa gct 595
Lys Leu Glu Lys Glu Ala Glu Val Lys Gly Ile Pro Val Cys Gln Ala
75 80 85

ttg gac ata gaa att ccg cct cct cgt cct aaa cga aaa ccc aat act 643
Leu Asp Ile Glu Ile Pro Pro Pro Arg Pro Lys Arg Lys Pro Asn Thr
90 95 100

cct tat cct cga aaa cct ggg aac aac ggt aca tct tcc tct caa gta 691
Pro Tyr Pro Arg Lys Pro Gly Asn Asn Gly Thr Ser Ser Ser Gln Val
105 110 115

tca tca gca aaa gat gca aaa ctt gtt tca tcg gcc tct tct tca cag 739
Ser Ser Ala Lys Asp Ala Lys Leu Val Ser Ser Ala Ser Ser Ser Gln
120 125 130

ttg aat cag gcg ttc ttg gat ttg gaa aaa atg ccg ttc tct gag aaa 787
Leu Asn Gln Ala Phe Leu Asp Leu Glu Lys Met Pro Phe Ser Glu Lys
135 140 145 150

aca tca act gga aaa gaa aat caa gat gag aat tgc tcg ggt gtt tct 835
Thr Ser Thr Gly Lys Glu Asn Gln Asp Glu Asn Cys Ser Gly Val Ser
155 160 165

act gtg aac aag tat ccc tta cca acg aaa cag gta agt ggc gac att 883
Thr Val Asn Lys Tyr Pro Leu Pro Thr Lys Gln Val Ser Gly Asp Ile
170 175 180

gaa aca agt aag acc tca act gtg gac aac gcg gtt caa gat gtt ccc 931
Glu Thr Ser Lys Thr Ser Thr Val Asp Asn Ala Val Gln Asp Val Pro
185 190 195

aag aag aac aaa gac aaa gat ggt aac gat ggt act act gtg cac agc 979
Lys Lys Asn Lys Asp Lys Asp Gly Asn Asp Gly Thr Thr Val His Ser
200 205 210

atg caa aac tac cct tgg cat ttc cac gca gat att gtg aac ggg aat 1027
Met Gln Asn Tyr Pro Trp His Phe His Ala Asp Ile Val Asn Gly Asn
215 220 225 230

ata gca aaa tgc cct caa aat cat ccc tca ggt atg gta tct caa gac 1075
Ile Ala Lys Cys Pro Gln Asn His Pro Ser Gly Met Val Ser Gln Asp
235 240 245

ttc atg ttt cat cct atg aga gaa gaa act cac ggg cac gca aat ctt 1123
Phe Met Phe His Pro Met Arg Glu Glu Thr His Gly His Ala Asn Leu
250 255 260

caa gct aca aca gca tct gct act act aca gct tct cat caa gcg ttt 1171
Gln Ala Thr Thr Ala Ser Ala Thr Thr Thr Ala Ser His Gln Ala Phe
265 270 275

cca gct tgt cat tca cag gat gat tac cgt tcg ttt ctc cag ata tca 1219
Pro Ala Cys His Ser Gln Asp Asp Tyr Arg Ser Phe Leu Gln Ile Ser
280 285 290

tct act ttc tcc aat ctt att atg tca act ctc cta cag aat cct gca 1267
Ser Thr Phe Ser Asn Leu Ile Met Ser Thr Leu Leu Gln Asn Pro Ala
295 300 305 310

gct cat gct gca gct aca ttc gct gct tcg gtc tgg cct tat gcg agt 1315
Ala His Ala Ala Ala Thr Phe Ala Ala Ser Val Trp Pro Tyr Ala Ser
315 320 325

gtc ggg aat tct ggt gat cca tca acc cca atg agc tct tct cct cca 1363
Val Gly Asn Ser Gly Asp Ser Ser Thr Pro Met Ser Ser Ser Pro Pro
330 335 340

agt ata act gcc att gcc gct gct aca gta gct gct gca act gct tgg 1411
Ser Ile Thr Ala Ile Ala Ala Ala Thr Val Ala Ala Ala Thr Ala Trp
345 350 355

tgg gct tct cat gga ctt ctt cct gta tgc gct cca gct cca ata aca 1459
Trp Ala Ser His Gly Leu Leu Pro Val Cys Ala Pro Ala Pro Ile Thr
360 365 370

tgt gtt cca ttc tca act gtt gca gtt cca act cca gca atg act gaa 1507
Cys Val Pro Phe Ser Thr Val Ala Val Pro Thr Pro Ala Met Thr Glu
375 380 385 390

atg gat acc gtt gaa aat act caa ccg ttt gag aaa caa aac aca gct 1555
Met Asp Thr Val Glu Asn Thr Gln Pro Phe Glu Lys Gln Asn Thr Ala
395 400 405

ctg caa gat caa acc ttg gct tcg aaa tct cca gct tca tca tct gat 1603
Leu Gln Asp Gln Thr Leu Ala Ser Lys Ser Pro Ala Ser Ser Ser Asp
410 415 420

gat tca gat gag act gga gta acc aag cta aat gcc gac tca aaa acc 1651
Asp Ser Asp Glu Thr Gly Val Thr Lys Leu Asn Ala Asp Ser Lys Thr
425 430 435

aat gat gat aaa att gag gag gtt gtt gtt act gcc gct gtg cat gac 1699
Asn Asp Asp Lys Ile Glu Glu Val Val Val Thr Ala Ala Val His Asp
440 445 450

tca aac act gcc cag aag aaa aat ctt gtg gac cgc tca tcg tgt ggc 1747
Ser Asn Thr Ala Gln Lys Lys Asn Leu Val Asp Arg Ser Ser Cys Gly
455 460 465 470

tca aat aca cct tca ggg agt gac gca gaa act gat gca tta gat aaa 1795
Ser Asn Thr Pro Ser Gly Ser Asp Ala Glu Thr Asp Ala Leu Asp Lys
475 480 485

atg gag aaa gat aaa gag gat gtg aag gag aca gat gag aat cag cca 1843
Met Glu Lys Asp Lys Glu Asp Val Lys Glu Thr Asp Glu Asn Gln Pro
490 495 500

gat gtt att gag tta aat aac cgt aag att aaa atg aga gac aac aac 1891
Asp Val Ile Glu Leu Asn Asn Arg Lys Ile Lys Met Arg Asp Asn Asn
505 510 515

agc aac aac aat gca act act gat tcg tgg aag gaa gtc tcc gaa gag 1939
Ser Asn Asn Asn Ala Thr Thr Asp Ser Trp Lys Glu Val Ser Glu Glu
520 525 530

ggt cgt ata gcg ttt cag gct ctc ttt gca aga gaa aga ttg cct caa 1987
Gly Arg Ile Ala Phe Gln Ala Leu Phe Ala Arg Glu Arg Leu Pro Gln
535 540 545 550

agc ttt tcg cct cct caa gtg gca gag aat gtg aat aga aaa caa agt 2035
Ser Phe Ser Pro Pro Gln Val Ala Glu Asn Val Asn Arg Lys Gln Ser
555 560 565

gac acg tca atg cca ttg gct cct aat ttc aaa agc cag gat tct tgt 2083
Asp Thr Ser Met Pro Leu Ala Pro Asn Phe Lys Ser Gln Asp Ser Cys
570 575 580

gct gca gac caa gaa gga gta gta atg atc ggt gtt gga aca tgc aag 2131
Ala Ala Asp Gln Glu Gly Val Val Met Ile Gly Val Gly Thr Cys Lys
585 590 595

agt ctt aaa acg aga cag aca gga ttt aag cca tac aag aga tgt tca 2179
Ser Leu Lys Thr Arg Gln Thr Gly Phe Lys Pro Tyr Lys Arg Cys Ser
600 605 610

atg gaa gtg aaa gag agc caa gtt ggg aac ata aac aat caa agt gat 2227
Met Glu Val Lys Glu Ser Gln Val Gly Asn Ile Asn Asn Gln Ser Asp
615 620 625 630

gaa aaa gtc tgc aaa agg ctt cga ttg gaa gga gaa gct tct aca tga 2275
Glu Lys Val Cys Lys Arg Leu Arg Leu Glu Gly Glu Ala Ser Thr
635 640 645

cagacttgga ggtaaaaaaa aaacatccac atttttatca atatctttaa atctagtgtt 2335

agtagtttgc ttctccaatc tttatgaaag agacttttaa ttttccttcc gaacatttct 2395

ttggtcatgt caggttctgt accatattac cccatgtctt gtctcttgtc tctgtttgtg 2455

tatgctactt gtggtctata tgtcatctgc tactactgtt aattaaccat taagcaatgg 2515

atttgtcttt a 2526

<210> 30
<211> 645
<212> PRT
<213> Arabidopsis thaliana
<400> 30

Met Asp Thr Asn Thr Ser Gly Glu Glu Leu Leu Ala Lys Ala Arg Lys
1 5 10 15

Pro Tyr Thr Ile Thr Lys Gln Arg Glu Arg Trp Thr Glu Asp Glu His
20 25 30

Glu Arg Phe Leu Glu Ala Leu Arg Leu Tyr Gly Arg Ala Trp Gln Arg
35 40 45

Ile Glu Glu His Ile Gly Thr Lys Thr Ala Val Gln Ile Arg Ser His
50 55 60

Ala Gln Lys Phe Phe Thr Lys Leu Glu Lys Glu Ala Glu Val Lys Gly
65 70 75 80

Ile Pro Val Cys Gln Ala Leu Asp Ile Glu Ile Pro Pro Pro Arg Pro
85 90 95

Lys Arg Lys Pro Asn Thr Pro Tyr Pro Arg Lys Pro Gly Asn Asn Gly
100 105 110

Thr Ser Ser Ser Gln Val Ser Ser Ala Lys Asp Ala Lys Leu Val Ser
115 120 125

Ser Ala Ser Ser Ser Gln Leu Asn Gln Ala Phe Leu Asp Leu Glu Lys
130 135 140

Met Pro Phe Ser Glu Lys Thr Ser Thr Gly Lys Glu Asn Gln Asp Glu
145 150 155 160

Asn Cys Ser Gly Val Ser Thr Val Asn Lys Tyr Pro Leu Pro Thr Lys
165 170 175

Gln Val Ser Gly Asp Ile Glu Thr Ser Lys Thr Ser Thr Val Asp Asn
180 185 190

Ala Val Gln Asp Val Pro Lys Lys Asn Lys Asp Lys Asp Gly Asn Asp
195 200 205

Gly Thr Thr Val His Ser Met Gln Asn Tyr Pro Trp His Phe His Ala
210 215 220

Asp Ile Val Asn Gly Asn Ile Ala Lys Cys Pro Gln Asn His Pro Ser
225 230 235 240

Gly Met Val Ser Gln Asp Phe Met Phe His Pro Met Arg Glu Glu Thr
245 250 255

His Gly His Ala Asn Leu Gln Ala Thr Thr Ala Ser Ala Thr Thr Thr
260 265 270

Ala Ser His Gln Ala Phe Pro Ala Cys His Ser Gln Asp Asp Tyr Arg
275 280 285

Ser Phe Leu Gln Ile Ser Ser Thr Phe Ser Asn Leu Ile Met Ser Thr
290 295 300

Leu Leu Gln Asn Pro Ala Ala His Ala Ala Ala Thr Phe Ala Ala Ser
305 310 315 320

Val Trp Pro Tyr Ala Ser Val Gly Asn Ser Gly Asp Ser Ser Thr Pro
325 330 335

Met Ser Ser Ser Pro Pro Ser Ile Thr Ala Ile Ala Ala Ala Thr Val
340 345 350

Ala Ala Ala Thr Ala Trp Trp Ala Ser His Gly Leu Leu Pro Val Cys
355 360 365

Ala Pro Ala Pro Ile Thr Cys Val Pro Phe Ser Thr Val Ala Val Pro
370 375 380

Thr Pro Ala Met Thr Glu Met Asp Thr Val Glu Asn Thr Gln Pro Phe
385 390 395 400

Glu Lys Gln Asn Thr Ala Leu Gln Asp Gln Thr Leu Ala Ser Lys Ser
405 410 415

Pro Ala Ser Ser Ser Asp Asp Ser Asp Glu Thr Gly Val Thr Lys Leu
420 425 430

Asn Ala Asp Ser Lys Thr Asn Asp Asp Lys Ile Glu Glu Val Val Val
435 440 445

Thr Ala Ala Val His Asp Ser Asn Thr Ala Gln Lys Lys Asn Leu Val
450 455 460

Asp Arg Ser Ser Cys Gly Ser Asn Thr Pro Ser Gly Ser Asp Ala Glu
465 470 475 480

Thr Asp Ala Leu Asp Lys Met Glu Lys Asp Lys Glu Asp Val Lys Glu
485 490 495

Thr Asp Glu Asn Gln Pro Asp Val Ile Glu Leu Asn Asn Arg Lys Ile
500 505 510

Lys Met Arg Asp Asn Asn Ser Asn Asn Asn Ala Thr Thr Asp Ser Trp
515 520 525

Lys Glu Val Ser Glu Glu Gly Arg Ile Ala Phe Gln Ala Leu Phe Ala
530 535 540

Arg Glu Arg Leu Pro Gln Ser Phe Ser Pro Pro Gln Val Ala Glu Asn
545 550 555 560

Val Asn Arg Lys Gln Ser Asp Thr Ser Met Pro Leu Ala Pro Asn Phe
565 570 575

Lys Ser Gln Asp Ser Cys Ala Ala Asp Gln Glu Gly Val Val Met Ile
580 585 590

Gly Val Gly Thr Cys Lys Ser Leu Lys Thr Arg Gln Thr Gly Phe Lys
595 600 605

Pro Tyr Lys Arg Cys Ser Met Glu Val Lys Glu Ser Gln Val Gly Asn
610 615 620

Ile Asn Asn Gln Ser Asp Glu Lys Val Cys Lys Arg Leu Arg Leu Glu
625 630 635 640

Gly Glu Ala Ser Thr
645



<210> 31
<211> 1195
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (67)..(1041)
<223> G883
<400> 31



ctctctcgtc ttcgtcttct tcttcttcaa cgttcctctc caaaatcctc agaccaagaa 60 

atcatc atg gcc gtc gat cta atg cgt ttc cct aag ata gat gat caa 108
Met Ala Val Asp Leu Met Arg Phe Pro Lys Ile Asp Asp Gln
1 5 10

acg gct att cag gaa gct gca tcg caa ggt tta caa agt atg gaa cat 156
Thr Ala Ile Gln Glu Ala Ala Ser Gln Gly Leu Gln Ser Met Glu His
15 20 25 30

ctg atc cgt gtc ctc tct aac cgt ccc gaa caa caa cac aac gtt gac 204
Leu Ile Arg Val Leu Ser Asn Arg Pro Glu Gln Gln His Asn Val Asp
35 40 45

tgc tcc gag atc act gac ttc acc gtt tct aaa ttc aaa acc gtc att 252
Cys Ser Glu Ile Thr Asp Phe Thr Val Ser Lys Phe Lys Thr Val Ile
50 55 60

tct ctc ctt aac cgt act ggt cac gct cgg ttc aga cgc gga ccg gtt 300
Ser Leu Leu Asn Arg Thr Gly His Ala Arg Phe Arg Arg Gly Pro Val
65 70 75

cac tcc act tcc tct gcc gca tct cag aaa cta cag agt cag atc gtt 348
His Ser Thr Ser Ser Ala Ala Ser Gln Lys Leu Gln Ser Gln Ile Val
80 85 90

aaa aat act caa cct gag gct ccg ata gtg aga aca act acg aat cac 396
Lys Asn Thr Gln Pro Glu Ala Pro Ile Val Arg Thr Thr Thr Asn His
95 100 105 110

cct caa atc gtt cct cca ccg tct agt gta aca ctc gat ttc tct aaa 444
Pro Gln Ile Val Pro Pro Pro Ser Ser Val Thr Leu Asp Phe Ser Lys
115 120 125

cca agc atc ttc ggc acc aaa gct aag agc gcc gag ctg gaa ttc tcc 492
Pro Ser Ile Phe Gly Thr Lys Ala Lys Ser Ala Glu Leu Glu Phe Ser
130 135 140

aaa gaa aac ttc agt gtt tct tta aac tcc tca ttc atg tcg tcg gcg 540
Lys Glu Asn Phe Ser Val Ser Leu Asn Ser Ser Phe Met Ser Ser Ala
145 150 155

ata acc gga gac ggc agc gtc tcc aat gga aaa atc ttc ctt gct tct 588
Ile Thr Gly Asp Gly Ser Val Ser Asn Gly Lys Ile Phe Leu Ala Ser
160 165 170

gct ccg tcg cag cct gtt aac tct tcc gga aaa cca ccg ttg gct ggt 636
Ala Pro Ser Gln Pro Val Asn Ser Ser Gly Lys Pro Pro Leu Ala Gly
175 180 185 190

cat cct tac aga aag aga tgt ctc gag cat gag cac tca gag agt ttc 684
His Pro Tyr Arg Lys Arg Cys Leu Glu His Glu His Ser Glu Ser Phe
195 200 205

tcc gga aaa gtc tcc ggc tcc gcc tac gga aag tgc cat tgc aag aaa 732
Ser Gly Lys Val Ser Gly Ser Ala Tyr Gly Lys Cys His Cys Lys Lys
210 215 220

agg aaa aat cgg atg aag aga acc gtg aga gta ccg gcg ata agt gca 780
Arg Lys Asn Arg Met Lys Arg Thr Val Arg Val Pro Ala Ile Ser Ala
225 230 235

aag atc gcc gat att cca ccg gac gaa tat tcg tgg agg aag tac gga 828
Lys Ile Ala Asp Ile Pro Pro Asp Glu Tyr Ser Trp Arg Lys Tyr Gly
240 245 250

caa aaa ccg atc aag ggc tca cca cac cca cgt ggt tac tac aag tgc 876
Gln Lys Pro Ile Lys Gly Ser Pro His Pro Arg Gly Tyr Tyr Lys Cys
255 260 265 270

agt aca ttc aga gga tgt cca gcg agg aaa cac gtg gaa cga gca tta 924
Ser Thr Phe Arg Gly Cys Pro Ala Arg Lys His Val Glu Arg Ala Leu
275 280 285

gat gat cca gcg atg ctt att gtg aca tac gaa gga gag cac cgt cat 972
Asp Asp Pro Ala Met Leu Ile Val Thr Tyr Glu Gly Glu His Arg His
290 295 300

aac caa tcc gcg atg cag gag aat att tct tct tca ggc att aat gat 1020
Asn Gln Ser Ala Met Gln Glu Asn Ile Ser Ser Ser Gly Ile Asn Asp
305 310 315

tta gtg ttt gcc tcg gct tga cttttttttg tactatttgt tttttgattt 1071
Leu Val Phe Ala Ser Ala
320

tttgagtact ttagatggat tgaaatttgt aaattttttt attaagaaat caatttaaat 1131
agagaaaaat tagtggtggt gcaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1191

aaaa 1195




<210> 32
<211> 324
<212> PRT
<213> Arabidopsis thaliana
<400> 32

Met Ala Val Asp Leu Met Arg Phe Pro Lys Ile Asp Asp Gln Thr Ala
1 5 10 15

Ile Gln Glu Ala Ala Ser Gln Gly Leu Gln Ser Met Glu His Leu Ile
20 25 30

Arg Val Leu Ser Asn Arg Pro Glu Gln Gln His Asn Val Asp Cys Ser
35 40 45

Glu Ile Thr Asp Phe Thr Val Ser Lys Phe Lys Thr Val Ile Ser Leu
50 55 60

Leu Asn Arg Thr Gly His Ala Arg Phe Arg Arg Gly Pro Val His Ser
65 70 75 80

Thr Ser Ser Ala Ala Ser Gln Lys Leu Gln Ser Gln Ile Val Lys Asn
85 90 95

Thr Gln Pro Glu Ala Pro Ile Val Arg Thr Thr Thr Asn His Pro Gln
100 105 110

Ile Val Pro Pro Pro Ser Ser Val Thr Leu Asp Phe Ser Lys Pro Ser
115 120 125

Ile Phe Gly Thr Lys Ala Lys Ser Ala Glu Leu Glu Phe Ser Lys Glu
130 135 140

Asn Phe Ser Val Ser Leu Asn Ser Ser Phe Met Ser Ser Ala Ile Thr
145 150 155 160

Gly Asp Gly Ser Val Ser Asn Gly Lys Ile Phe Leu Ala Ser Ala Pro
165 170 175

Ser Gln Pro Val Asn Ser Ser Gly Lys Pro Pro Leu Ala Gly His Pro
180 185 190

Tyr Arg Lys Arg Cys Leu Glu His Glu His Ser Glu Ser Phe Ser Gly
195 200 205

Lys Val Ser Gly Ser Ala Tyr Gly Lys Cys His Cys Lys Lys Arg Lys
210 215 220

Asn Arg Met Lys Arg Thr Val Arg Val Pro Ala Ile Ser Ala Lys Ile
225 230 235 240

Ala Asp Ile Pro Pro Asp Glu Tyr Ser Trp Arg Lys Tyr Gly Gln Lys
245 250 255

Pro Ile Lys Gly Ser Pro His Pro Arg Gly Tyr Tyr Lys Cys Ser Thr
260 265 270

Phe Arg Gly Cys Pro Ala Arg Lys His Val Glu Arg Ala Leu Asp Asp
275 280 285

Pro Ala Met Leu Ile Val Thr Tyr Glu Gly Glu His Arg His Asn Gln
290 295 300

Ser Ala Met Gln Glu Asn Ile Ser Ser Ser Gly Ile Asn Asp Leu Val
305 310 315 320

Phe Ala Ser Ala



<210> 33
<211> 1902
<212> DNA
<213> Arabidopsis
<220>
<221> CDS
<222> (1)..(1902)
<223> G1855
<400> 33



atg gcg aaa gag aac agt ggt cat cat cac caa aca gaa gca aga aga
Met Ala Lys Glu Asn Ser Gly His His His Gln Thr Glu Ala Arg Arg
1 5 10 15

aag aaa cta act ttg att ctt ggt gta agt gga ctc tgc att ttg ttc
Lys Lys Leu Thr Leu Ile Leu Gly Val Ser Gly Leu Cys Ile Leu Phe
20 25 30

tat gtt tta ggt gca tgg caa gcc aat acc gtc cca tct tct atc tcg
Tyr Val Leu Gly Ala Trp Gln Ala Asn Thr Val Pro Ser Ser Ile Ser
35 40 45

aag ctc gga tgc gag acg caa tca aac cct tct tcg tcc tct tcc tct
Lys Leu Gly Cys Glu Thr Gln Ser Asn Pro Ser Ser Ser Ser Ser Ser
50 55 60

tcc tca tct tca gag tca gct gaa cta gat ttc aaa agc cat aat cag
Ser Ser Ser Ser Glu Ser Ala Glu Leu Asp Phe Lys Ser His Asn Gln
65 70 75 80

att gag tta aag gaa aca aac caa acc att aag tac ttt gaa cca tgt
Ile Glu Leu Lys Glu Thr Asn Gln Thr Ile Lys Tyr Phe Glu Pro Cys
85 90 95

gaa tta tct ctc agt gag tac act cct tgt gaa gac cga caa aga gga
Glu Leu Ser Leu Ser Glu Tyr Thr Pro Cys Glu Asp Arg Gln Arg Gly
100 105 110

aga aga ttc gat agg aac atg atg aaa tat aga gaa aga cat tgt cct
Arg Arg Phe Asp Arg Asn Met Met Lys Tyr Arg Glu Arg His Cys Pro
115 120 125

gta aaa gat gag ctt ctt tat tgt ttg att cct cct cca cca aac tac
Val Lys Asp Glu Leu Leu Tyr Cys Leu Ile Pro Pro Pro Pro Asn Tyr
130 135 140

aag att cca ttt aaa tgg cca caa agt aga gac tat gct tgg tat gac
Lys Ile Pro Phe Lys Trp Pro Gln Ser Arg Asp Tyr Ala Trp Tyr Asp
145 150 155 160

aat atc cct cac aag gaa ctt agt gtt gag aaa gca gtt caa aac tgg
Asn Ile Pro His Lys Glu Leu Ser Val Glu Lys Ala Val Gln Asn Trp
165 170 175

att caa gtt gaa ggt gac cgc ttt aga ttc cct ggt ggt ggt act atg
Ile Gln Val Glu Gly Asp Arg Phe Arg Phe Pro Gly Gly Gly Thr Met
180 185 190

ttt cct cgt gga gct gat gct tat atc gat gat att gct agg ctt att
Phe Pro Arg Gly Ala Asp Ala Tyr Ile Asp Asp Ile Ala Arg Leu Ile
195 200 205

cct ctt act gat ggt gga atc aga aca gct att gac act gga tgt ggt
Pro Leu Thr Asp Gly Gly Ile Arg Thr Ala Ile Asp Thr Gly Cys Gly
210 215 220

gtt gca agt ttt ggt gct tac ctc ttg aag aga gac att atg gct gtg
Val Ala Ser Phe Gly Ala Tyr Leu Leu Lys Arg Asp Ile Met Ala Val
225 230 235 240

tct ttt gct cca aga gac act cat gaa gct cag gta cag ttt gct tta
Ser Phe Ala Pro Arg Asp Thr His Glu Ala Gln Val Gln Phe Ala Leu
245 250 255

gaa cgc gga gtt cct gcg ata atc ggg att atg gga tca aga aga ctt
Glu Arg Gly Val Pro Ala Ile Ile Gly Ile Met Gly Ser Arg Arg Leu
260 265 270

cct tat cca gct aga gct ttt gat ctt gct cat tgt tct cgt tgt ttg
Pro Tyr Pro Ala Arg Ala Phe Asp Leu Ala His Cys Ser Arg Cys Leu
275 280 285

atc cct tgg ttt aaa aat gat ggt ttg tac ctt atg gag gtc gac cgg
Ile Pro Trp Phe Lys Asn Asp Gly Leu Tyr Leu Met Glu Val Asp Arg
290 295 300

gtt tta aga ccg ggc ggt tac tgg atc ctc tcg gga cca ccg att aac
Val Leu Arg Pro Gly Gly Tyr Trp Ile Leu Ser Gly Pro Pro Ile Asn
305 310 315 320

tgg aaa cag tac tgg aga ggg tgg gag aga aca gag gag gat ttg aag
Trp Lys Gln Tyr Trp Arg Gly Trp Glu Arg Thr Glu Glu Asp Leu Lys
325 330 335
325 330 335 
aaa gag caa gat tca ata gaa gat gta gca aag agt ctt tgc tgg aag
Lys Glu Gln Asp Ser Ile Glu Asp Val Ala Lys Ser Leu Cys Trp Lys
340 345 350

aaa gta act gaa aaa ggt gac tta tca att tgg caa aag cct ctc aat
Lys Val Thr Glu Lys Gly Asp Leu Ser Ile Trp Gln Lys Pro Leu Asn
355 360 365

cac att gag tgt aaa aag ctc aaa caa aac aat aag tca cct ccg ata
His Ile Glu Cys Lys Lys Leu Lys Gln Asn Asn Lys Ser Pro Pro Ile
370 375 380

tgc agc tca gat aac gcg gat tcc gct tgg tac aaa gac ttg gaa act
Cys Ser Ser Asp Asn Ala Asp Ser Ala Trp Tyr Lys Asp Leu Glu Thr
385 390 395 400

tgt ata aca cca tta cca gaa aca aac aat cca gat gat tca gca ggc
Cys Ile Thr Pro Leu Pro Glu Thr Asn Asn Pro Asp Asp Ser Ala Gly
405 410 415

ggt gca ctc gag gat tgg cca gac cga gca ttc gcg gta cct cca aga
Gly Ala Leu Glu Asp Trp Pro Asp Arg Ala Phe Ala Val Pro Pro Arg
420 425 430

atc atc aga gga act ata cca gaa atg aac gcg gag aaa ttt aga gaa
Ile Ile Arg Gly Thr Ile Pro Glu Met Asn Ala Glu Lys Phe Arg Glu
435 440 445

gac aac gag gtt tgg aaa gag aga ata gca cat tac aag aag ata gtc
Asp Asn Glu Val Trp Lys Glu Arg Ile Ala His Tyr Lys Lys Ile Val
450 455 460

cct gag ctt tca cat gga aga ttc agg aac att atg gac atg aac gct
Pro Glu Leu Ser His Gly Arg Phe Arg Asn Ile Met Asp Met Asn Ala
465 470 475 480

ttt ctc ggc gga ttc gct gct tcc atg ctg aaa tat ccc tca tgg gtc
Phe Leu Gly Gly Phe Ala Ala Ser Met Leu Lys Tyr Pro Ser Trp Val
485 490 495

atg aac gtt gtc ccg gtc gat gca gag aaa caa acg tta ggt gtg atc
Met Asn Val Val Pro Val Asp Ala Glu Lys Gln Thr Leu Gly Val Ile
500 505 510

tac gaa cgt gga ttg ata ggg acg tat caa gat tgg tgt gaa gga ttc
Tyr Glu Arg Gly Leu Ile Gly Thr Tyr Gln Asp Trp Cys Glu Gly Phe
515 520 525

tca acg tat cca aga act tat gat atg att cat gca gga gga ttg ttc
Ser Thr Tyr Pro Arg Thr Tyr Asp Met Ile His Ala Gly Gly Leu Phe
530 535 540

agc tta tac gaa cat agg tgt gat ttg acg ttg ata ttg ttg gag atg
Ser Leu Tyr Glu His Arg Cys Asp Leu Thr Leu Ile Leu Leu Glu Met
545 550 555 560

gat cga att ttg aga cca gaa gga aca gtt gtg ttg aga gat aat gtg
Asp Arg Ile Leu Arg Pro Glu Gly Thr Val Val Leu Arg Asp Asn Val
565 570 575

gag acg ttg aat aag gta gag aag ata gtg aag gga atg aag tgg aag
Glu Thr Leu Asn Lys Val Glu Lys Ile Val Lys Gly Met Lys Trp Lys
580 585 590

agt caa att gtt gat cat gag aaa ggt cct ttt aat cct gag aag att
Ser Gln Ile Val Asp His Glu Lys Gly Pro Phe Asn Pro Glu Lys Ile
595 600 605

ctt gtt gct gtt aaa act tat tgg act ggt caa cct tct gac aag aac
Leu Val Ala Val Lys Thr Tyr Trp Thr Gly Gln Pro Ser Asp Lys Asn
610 615 620

aac aac aac aac aac aac aac aac aac tag
Asn Asn Asn Asn Asn Asn Asn Asn Asn
625 630

<210> 34
<211> 633
<212> PRT
<213> Arabidopsis thaliana
<400> 34

Met Ala Lys Glu Asn Ser Gly His His His Gln Thr Glu Ala Arg Arg
1 5 10 15

Lys Lys Leu Thr Leu Ile Leu Gly Val Ser Gly Leu Cys Ile Leu Phe
20 25 30

Tyr Val Leu Gly Ala Trp Gln Ala Asn Thr Val Pro Ser Ser Ile Ser
35 40 45

Lys Leu Gly Cys Glu Thr Gln Ser Asn Pro Ser Ser Ser Ser Ser Ser
50 55 60

Ser Ser Ser Ser Glu Ser Ala Glu Leu Asp Phe Lys Ser His Asn Gln
65 70 75 80

Ile Glu Leu Lys Glu Thr Asn Gln Thr Ile Lys Tyr Phe Glu Pro Cys
85 90 95

Glu Leu Ser Leu Ser Glu Tyr Thr Pro Cys Glu Asp Arg Gln Arg Gly
100 105 110

Arg Arg Phe Asp Arg Asn Met Met Lys Tyr Arg Glu Arg His Cys Pro
115 120 125

Val Lys Asp Glu Leu Leu Tyr Cys Leu Ile Pro Pro Pro Pro Asn Tyr
130 135 140

Lys Ile Pro Phe Lys Trp Pro Gln Ser Arg Asp Tyr Ala Trp Tyr Asp
145 150 155 160

Asn Ile Pro His Lys Glu Leu Ser Val Glu Lys Ala Val Gln Asn Trp
165 170 175

Ile Gln Val Glu Gly Asp Arg Phe Arg Phe Pro Gly Gly Gly Thr Met
180 185 190

Phe Pro Arg Gly Ala Asp Ala Tyr Ile Asp Asp Ile Ala Arg Leu Ile
195 200 205

Pro Leu Thr Asp Gly Gly Ile Arg Thr Ala Ile Asp Thr Gly Cys Gly
210 215 220

Val Ala Ser Phe Gly Ala Tyr Leu Leu Lys Arg Asp Ile Met Ala Val
225 230 235 240

Ser Phe Ala Pro Arg Asp Thr His Glu Ala Gln Val Gln Phe Ala Leu
245 250 255

Glu Arg Gly Val Pro Ala Ile Ile Gly Ile Met Gly Ser Arg Arg Leu
260 265 270
Pro Tyr Pro Ala Arg Ala Phe Asp Leu Ala His Cys Ser Arg Cys Leu
275 280 285

Ile Pro Trp Phe Lys Asn Asp Gly Leu Tyr Leu Met Glu Val Asp Arg
290 295 300

Val Leu Arg Pro Gly Gly Tyr Trp Ile Leu Ser Gly Pro Pro Ile Asn
305 310 315 320

Trp Lys Gln Tyr Trp Arg Gly Trp Glu Arg Thr Glu Glu Asp Leu Lys
325 330 335

Lys Glu Gln Asp Ser Ile Glu Asp Val Ala Lys Ser Leu Cys Trp Lys
340 345 350

Lys Val Thr Glu Lys Gly Asp Leu Ser Ile Trp Gln Lys Pro Leu Asn
355 360 365

His Ile Glu Cys Lys Lys Leu Lys Gln Asn Asn Lys Ser Pro Pro Ile
370 375 380

Cys Ser Ser Asp Asn Ala Asp Ser Ala Trp Tyr Lys Asp Leu Glu Thr
385 390 395 400

Cys Ile Thr Pro Leu Pro Glu Thr Asn Asn Pro Asp Asp Ser Ala Gly
405 410 415

Gly Ala Leu Glu Asp Trp Pro Asp Arg Ala Phe Ala Val Pro Pro Arg
420 425 430

Ile Ile Arg Gly Thr Ile Pro Glu Met Asn Ala Glu Lys Phe Arg Glu
435 440 445

Asp Asn Glu Val Trp Lys Glu Arg Ile Ala His Tyr Lys Lys Ile Val
450 455 460

Pro Glu Leu Ser His Gly Arg Phe Arg Asn Ile Met Asp Met Asn Ala
465 470 475 480

Phe Leu Gly Gly Phe Ala Ala Ser Met Leu Lys Tyr Pro Ser Trp Val
485 490 495

Met Asn Val Val Pro Val Asp Ala Glu Lys Gln Thr Leu Gly Val Ile
500 505 510

Tyr Glu Arg Gly Leu Ile Gly Thr Tyr Gln Asp Trp Cys Glu Gly Phe
515 520 525

Ser Thr Tyr Pro Arg Thr Tyr Asp Met Ile His Ala Gly Gly Leu Phe
530 535 540

Ser Leu Tyr Glu His Arg Cys Asp Leu Thr Leu Ile Leu Leu Glu Met
545 550 555 560

Asp Arg Ile Leu Arg Pro Glu Gly Thr Val Val Leu Arg Asp Asn Val
565 570 575

Glu Thr Leu Asn Lys Val Glu Lys Ile Val Lys Gly Met Lys Trp Lys
580 585 590

Ser Gln Ile Val Asp His Glu Lys Gly Pro Phe Asn Pro Glu Lys Ile
595 600 605

Leu Val Ala Val Lys Thr Tyr Trp Thr Gly Gln Pro Ser Asp Lys Asn
610 615 620

Asn Asn Asn Asn Asn Asn Asn Asn Asn
625 630

<210> 35 

<211> 2324 

<212> DNA 

<213> Arabidopsis thaliana
<220> 

<221> CDS 

<222> (209)..(2020)

<223> G1190 

<400> 35 

tcctgtccca aaaccaaaag gcttgagagt gtgtctttag agagagatct tctctctttt 60 

atcttacgac tctcacttct tatctcaaat ctacttcaac tctatttcca gtctccacat 120 

tttcccacaa atttcaactc ttgttctctt catccaaagt aaaaaacaaa tcgttgcaag 180 

tgaggtttgg ttttggtgtt atagaatt atg aag agc ggg aag caa tct tcg 232

Met Lys Ser Gly Lys Gln Ser Ser
1 5

caa cct gaa aag ggt act tcc agg atc ttg tca ctg act gtc ctg ttt 280
Gln Pro Glu Lys Gly Thr Ser Arg Ile Leu Ser Leu Thr Val Leu Phe
10 15 20

atc gca ttt tgc ggt ttc tcc ttc tac ctc ggt ggt ata ttt tgc tct 328
Ile Ala Phe Cys Gly Phe Ser Phe Tyr Leu Gly Gly Ile Phe Cys Ser
25 30 35 40

gag aga gac aag att gta gcc aag gat gtc aca agg acg act aca aag 376
Glu Arg Asp Lys Ile Val Ala Lys Asp Val Thr Arg Thr Thr Thr Lys
45 50 55

gct gta gct tcc cct aaa gaa cct aca gct act cct att caa atc aaa 424
Ala Val Ala Ser Pro Lys Glu Pro Thr Ala Thr Pro Ile Gln Ile Lys
60 65 70

tcc gtt tct ttc ccg gag tgc ggg tca gag ttc caa gat tac acc ccg 472
Ser Val Ser Phe Pro Glu Cys Gly Ser Glu Phe Gln Asp Tyr Thr Pro
75 80 85

tgc acc gat cca aag agg tgg aag aag tat ggt gtc cat cgc tta agt 520
Cys Thr Asp Pro Lys Arg Trp Lys Lys Tyr Gly Val His Arg Leu Ser
90 95 100

ttc ttg gag cgt cat tgt cct ccg gta tat gaa aag aat gag tgt ttg 568
Phe Leu Glu Arg His Cys Pro Pro Val Tyr Glu Lys Asn Glu Cys Leu
105 110 115 120

att cca cca cca gac ggg tat aaa ccg cct ata aga tgg ccc aag agc 616
Ile Pro Pro Pro Asp Gly Tyr Lys Pro Pro Ile Arg Trp Pro Lys Ser
125 130 135

cga gaa cag tgt tgg tac agg aac gtg cct tat gat tgg atc aat aag 664
Arg Glu Gln Cys Trp Tyr Arg Asn Val Pro Tyr Asp Trp Ile Asn Lys
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140 



MBI-20 Sequence Listing. ST25 
145 150 



caa aag tct 
Gin Lys Ser 
155 



aac cag cat tgg ctt aag aaa gaa gga gat 
Asn Gin His Trp Leu Lys Lys Glu Gly Asp 
160 165 



aag ttc cat 
Lys Phe His 



712 



ttc cct ggt 
Phe Pro Gly 
170 



ggt ggt acc atg ttc cct cgt gga gtt agt 
Gly Gly Thr Met Phe Pro Arg Gly Val Ser 
175 180 



cac tat gtt 
His Tyr Val 



760 



gat ttg atg 
Asp Leu Met 
185 



caa gat ctg att cct gaa atg aaa gac gga 
Gin Asp Leu lie Pro Glu Met Lys Asp Gly 
190 195 



aca gtc agg 
Thr Val Arg 
200 



808 



acc gcc att 
Thr Ala lie 



gat act ggc tgt ggg gtt gcg age tgg gga 
Asp Thr Gly Cys Gly Val Ala Ser Trp Gly 
205 210 



ggc gat ctt 
Gly Asp Leu 
215 



856 



ttg gac cgt 
Leu Asp Arg 



ggg ata eta tea etc tct ctt get cca aga 
Gly lie Leu Ser Leu Ser Leu Ala Pro Arg 
220 225 



gat aac cat 
Asp Asn His 
230 



904 



gaa 
Glu 



get cag 
Ala Gin 
235 



gtt caa 
val Gin 



ttt get 
Phe Ala 



ctt gaa 
Leu Glu 
240 



cgt gga att cct 
Arg Gly lie Pro 
245 



gcg att etc 
Ala He Leu 



952 



ggg 

Gly 



ate ate 
He He 
250 



tct acg 
Ser Thr 



caa cgt 
Gin Arg 
255 



etc cct 
Leu Pro 



ttt cct tea aat 
Phe Pro Ser Asn 
260 



gca ttt gat 
Ala Phe Asp 



1000 



atg get cat tgt tea aga tgt ctt att ccc tgg aca gaa 
Met Ala His Cys Ser Arg Cys Leu He Pro Trp Thr Glu 
265 270 275 



ttt ggt gga 
Phe Gly Gly 
280 



1048 



ate tat tta ctt gag att cac cgt ata gtt cga cct gga 
He Tyr Leu Leu Glu He His Arg He Val Arg Pro Gly 
285 290 



ggt ttt tgg 
Gly Phe Trp 
295 



1096 



gtt ctt tct 
val Leu Ser 



ggt cca 
Gly Pro 
300 



cct gtg 
Pro Val 



aac tat 
Asn Tyr 
305 



aat aga cga tgg 
Asn Arg Arg Trp 



cgt gga tgg 
Arg Gly Trp 
310 



1144 



aac aca acc 
Asn Thr Thr 
315 



atg gaa 
Met Glu 



gat cag 
Asp Gin 



aaa tct 
Lys Ser 
320 



gac tac 
Asp Tyr 



aac aag 
Asn Lys 
325 



ctt cag tea 
Leu Gin Ser 



1192 



ctt eta acc 
Leu Leu Thr 
330 



tec atg 
Ser Met 



tgt ttc 
Cys Phe 
335 



aaa aag 
Lys Lys 



tac get 
Tyr Ala 



caa aaa 
Gin Lys 
340 



gat gac ata 
Asp Asp He 



1240 



gcc gtg tgg 
Ala Val Trp 
345 



cag aaa 
Gin Lys 



etc tea 
Leu Ser 
350 



gac aaa 
Asp Lys 



tct tgc 
Ser Cys 
355 



tat gac 
Tyr Asp 



aaa 
Lys 



ate 
He 



get 
Ala 
360 



1288 



aag aac atg 
Lys Asn Met 



gaa get 
Glu Ala 
365 



tac cct 
Tyr Pro 



ccc aaa 
Pro Lys 



tgt gac 
Cys Asp 
370 



gac agt 
Asp Ser 



ata 
He 



gaa 
Glu 
375 



CCt 
Pro 



1336 



gat tct get 
Asp Ser Ala 



tgg tac 
Trp Tyr 
380 



act cca 
Thr Pro 



etc cgt 
Leu Arg 
385 



cct tgc 
Pro Cys 



gtg gtt 
Val Val 



gcc 
Ala 
390 



ccg 
Pro 



aca 
Thr 



1384 



cct aaa gtc aag aag tct ggt etc gga tea ate cca aaa tgg ccc gag 
Pro Lys Val Lys Lys Ser Gly Leu Gly Ser He Pro Lys Trp Pro Glu 
395 400 405 



1432 



agg 
Arg 



tta cat 
Leu His 
410 



gtc gcg 
Val Ala 



ccc gag 
Pro Glu 
415 



aga ate 
Arg He 



ggt gat gtt cac 
Gly Asp Val His 
420 



gga ggg agt 

Gly Gly Ser 



1480 



gcg aac agt ttg aaa cac gat gat ggt aaa tgg aag aac 
Ala Asn Ser Leu Lys His Asp Asp Gly Lys Trp Lys Asn 
425 430 435 



aga gtt aag 
Arg Val Lys 
440 



1528 



cat tac aag aaa gtt tta cca get ctt 



ggg aca gac aag 
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His Tyr Lys Lys Val Leu Pro Ala Leu Gly Thr Asp Lys lie Arg Asn 
445 450 455 

gtt atg gat atg aac act gtt tat gga ggt ttc tct gcg gcc ctc att 1624
Val Met Asp Met Asn Thr Val Tyr Gly Gly Phe Ser Ala Ala Leu Ile
460 465 470

gag gat ccc att tgg gtc atg aac gtt gta tca tcg tac agc gca aat 1672
Glu Asp Pro Ile Trp Val Met Asn Val Val Ser Ser Tyr Ser Ala Asn
475 480 485

tcg ctt cct gtt gtc ttt gat cgc ggt ctc atc ggg act tac cac gac 1720
Ser Leu Pro Val Val Phe Asp Arg Gly Leu Ile Gly Thr Tyr His Asp
490 495 500

tgg tgc gaa gct ttc tca acg tat cca aga aca tat gat ctt ctt cac 1768
Trp Cys Glu Ala Phe Ser Thr Tyr Pro Arg Thr Tyr Asp Leu Leu His
505 510 515 520

ctc gac agt ctt ttt acc ttg gag agt cac agg tgt gag atg aag tac 1816
Leu Asp Ser Leu Phe Thr Leu Glu Ser His Arg Cys Glu Met Lys Tyr
525 530 535

att ttg cta gag atg gac agg atc ttg cgg ccg agt gga tat gtt ata 1864
Ile Leu Leu Glu Met Asp Arg Ile Leu Arg Pro Ser Gly Tyr Val Ile
540 545 550



atc cga gaa tcg agt tat ttc atg gac gca atc aca acg tta gcg aaa 1912
Ile Arg Glu Ser Ser Tyr Phe Met Asp Ala Ile Thr Thr Leu Ala Lys
555 560 565

ggg ata agg tgg agt tgc cgg aga gag gag act gag tat gca gtc aaa 1960
Gly Ile Arg Trp Ser Cys Arg Arg Glu Glu Thr Glu Tyr Ala Val Lys
570 575 580

agt gag aag att ctg gtt tgc cag aaa aag cta tgg ttt tcg tca aac 2008
Ser Glu Lys Ile Leu Val Cys Gln Lys Lys Leu Trp Phe Ser Ser Asn
585 590 595 600

caa acc tct tga tgagaccacc tgtatcatag tgtttatcat ctcctgtgat 2060
Gln Thr Ser

gcacactaca gagagaagga tctagtcctt tgagtccaag atatagctct ataaacaatc 2120

tccttttttt gttctcttta atttcttggg tatttcacgg tatagattga tattatatat 2180

tttttaatta tatttttaat atatagatat attagtatgt ggtttaaaca ctattattat 2240

caaggtctta aagatttgct ttgcaagagt taaaaaatgt tggagtaagg acctcttgat 2300

taataaattg actgacgcag caaa 2324

<210> 36
<211> 603
<212> PRT
<213> Arabidopsis thaliana
<400> 36

Met Lys Ser Gly Lys Gln Ser Ser Gln Pro Glu Lys Gly Thr Ser Arg
1 5 10 15

Ile Leu Ser Leu Thr Val Leu Phe Ile Ala Phe Cys Gly Phe Ser Phe
20 25 30

Tyr Leu Gly Gly Ile Phe Cys Ser Glu Arg Asp Lys Ile Val Ala Lys
35 40 45

Asp Val Thr Arg Thr Thr Thr Lys Ala Val Ala Ser Pro Lys Glu Pro
50 55 60

Thr Ala Thr Pro Ile Gln Ile Lys Ser Val Ser Phe Pro Glu Cys Gly
65 70 75 80

Ser Glu Phe Gln Asp Tyr Thr Pro Cys Thr Asp Pro Lys Arg Trp Lys
85 90 95

Lys Tyr Gly Val His Arg Leu Ser Phe Leu Glu Arg His Cys Pro Pro
100 105 110

Val Tyr Glu Lys Asn Glu Cys Leu Ile Pro Pro Pro Asp Gly Tyr Lys
115 120 125

Pro Pro Ile Arg Trp Pro Lys Ser Arg Glu Gln Cys Trp Tyr Arg Asn
130 135 140

Val Pro Tyr Asp Trp Ile Asn Lys Gln Lys Ser Asn Gln His Trp Leu
145 150 155 160

Lys Lys Glu Gly Asp Lys Phe His Phe Pro Gly Gly Gly Thr Met Phe
165 170 175

Pro Arg Gly Val Ser His Tyr Val Asp Leu Met Gln Asp Leu Ile Pro
180 185 190

Glu Met Lys Asp Gly Thr Val Arg Thr Ala Ile Asp Thr Gly Cys Gly
195 200 205

Val Ala Ser Trp Gly Gly Asp Leu Leu Asp Arg Gly Ile Leu Ser Leu
210 215 220

Ser Leu Ala Pro Arg Asp Asn His Glu Ala Gln Val Gln Phe Ala Leu
225 230 235 240

Glu Arg Gly Ile Pro Ala Ile Leu Gly Ile Ile Ser Thr Gln Arg Leu
245 250 255

Pro Phe Pro Ser Asn Ala Phe Asp Met Ala His Cys Ser Arg Cys Leu
260 265 270

Ile Pro Trp Thr Glu Phe Gly Gly Ile Tyr Leu Leu Glu Ile His Arg
275 280 285

Ile Val Arg Pro Gly Gly Phe Trp Val Leu Ser Gly Pro Pro Val Asn
290 295 300

Tyr Asn Arg Arg Trp Arg Gly Trp Asn Thr Thr Met Glu Asp Gln Lys
305 310 315 320

Ser Asp Tyr Asn Lys Leu Gln Ser Leu Leu Thr Ser Met Cys Phe Lys
325 330 335

Lys Tyr Ala Gln Lys Asp Asp Ile Ala Val Trp Gln Lys Leu Ser Asp
340 345 350

Lys Ser Cys Tyr Asp Lys Ile Ala Lys Asn Met Glu Ala Tyr Pro Pro
355 360 365

Lys Cys Asp Asp Ser Ile Glu Pro Asp Ser Ala Trp Tyr Thr Pro Leu
370 375 380

Arg Pro Cys Val Val Ala Pro Thr Pro Lys Val Lys Lys Ser Gly Leu
385 390 395 400

Gly Ser Ile Pro Lys Trp Pro Glu Arg Leu His Val Ala Pro Glu Arg
405 410 415

Ile Gly Asp Val His Gly Gly Ser Ala Asn Ser Leu Lys His Asp Asp
420 425 430

Gly Lys Trp Lys Asn Arg Val Lys His Tyr Lys Lys Val Leu Pro Ala
435 440 445

Leu Gly Thr Asp Lys Ile Arg Asn Val Met Asp Met Asn Thr Val Tyr
450 455 460

Gly Gly Phe Ser Ala Ala Leu Ile Glu Asp Pro Ile Trp Val Met Asn
465 470 475 480

Val Val Ser Ser Tyr Ser Ala Asn Ser Leu Pro Val Val Phe Asp Arg
485 490 495

Gly Leu Ile Gly Thr Tyr His Asp Trp Cys Glu Ala Phe Ser Thr Tyr
500 505 510

Pro Arg Thr Tyr Asp Leu Leu His Leu Asp Ser Leu Phe Thr Leu Glu
515 520 525

Ser His Arg Cys Glu Met Lys Tyr Ile Leu Leu Glu Met Asp Arg Ile
530 535 540

Leu Arg Pro Ser Gly Tyr Val Ile Ile Arg Glu Ser Ser Tyr Phe Met
545 550 555 560

Asp Ala Ile Thr Thr Leu Ala Lys Gly Ile Arg Trp Ser Cys Arg Arg
565 570 575

Glu Glu Thr Glu Tyr Ala Val Lys Ser Glu Lys Ile Leu Val Cys Gln
580 585 590

Lys Lys Leu Trp Phe Ser Ser Asn Gln Thr Ser
595 600



<210> 37
<211> 1951
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (196)..(1794)
<223> G308
<400> 37

agtaatttag tttttctttt ttttttttac aatttatttt gttattagaa gtggtagtgg 60

agtgaaaaaa caaatcctaa gcagtcctaa ccgatccccg aagctaaaga ttcttcacct 120 

tcccaaataa agcaaaacct agatccgaca ttgaaggaaa aaccttttag atccatctct 180 

gaaaaaaacc caacc atg aag aga gat cat cat cat cat cat caa gat aag 231
Met Lys Arg Asp His His His His His Gln Asp Lys
1 5 10

aag act atg atg atg aat gaa gaa gac gac ggt aac ggc atg gat gag 279
Lys Thr Met Met Met Asn Glu Glu Asp Asp Gly Asn Gly Met Asp Glu
15 20 25

ctt cta gct gtt ctt ggt tac aag gtt agg tca tcg gaa atg gct gat 327
Leu Leu Ala Val Leu Gly Tyr Lys Val Arg Ser Ser Glu Met Ala Asp
30 35 40

gtt gct cag aaa ctc gag cag ctt gaa gtt atg atg tct aat gtt caa 375
Val Ala Gln Lys Leu Glu Gln Leu Glu Val Met Met Ser Asn Val Gln
45 50 55 60

gaa gac gat ctt tct caa ctc gct act gag act gtt cac tat aat ccg 423
Glu Asp Asp Leu Ser Gln Leu Ala Thr Glu Thr Val His Tyr Asn Pro
65 70 75

gcg gag ctt tac acg tgg ctt gat tct atg ctc acc gac ctt aat cct 471
Ala Glu Leu Tyr Thr Trp Leu Asp Ser Met Leu Thr Asp Leu Asn Pro
80 85 90

ccg tcg tct aac gcc gag tac gat ctt aaa gct att ccc ggt gac gcg 519
Pro Ser Ser Asn Ala Glu Tyr Asp Leu Lys Ala Ile Pro Gly Asp Ala
95 100 105

att ctc aat cag ttc gct atc gat tcg gct tct tcg tct aac caa ggc 567
Ile Leu Asn Gln Phe Ala Ile Asp Ser Ala Ser Ser Ser Asn Gln Gly
110 115 120

ggc gga gga gat acg tat act aca aac aag cgg ttg aaa tgc tca aac 615
Gly Gly Gly Asp Thr Tyr Thr Thr Asn Lys Arg Leu Lys Cys Ser Asn
125 130 135 140

ggc gtc gtg gaa acc acc aca gcg acg gct gag tca act cgg cat gtt 663
Gly Val Val Glu Thr Thr Thr Ala Thr Ala Glu Ser Thr Arg His Val
145 150 155

gtc ctg gtt gac tcg cag gag aac ggt gtg cgt ctc gtt cac gcg ctt 711
Val Leu Val Asp Ser Gln Glu Asn Gly Val Arg Leu Val His Ala Leu
160 165 170

ttg gct tgc gct gaa gct gtt cag aag gag aat ctg act gtg gcg gaa 759
Leu Ala Cys Ala Glu Ala Val Gln Lys Glu Asn Leu Thr Val Ala Glu
175 180 185

gct ctg gtg aag caa atc gga ttc tta gct gtt tct caa atc gga gct 807
Ala Leu Val Lys Gln Ile Gly Phe Leu Ala Val Ser Gln Ile Gly Ala
190 195 200

atg aga caa gtc gct act tac ttc gcc gaa gct ctc gcg cgg cgg att 855
Met Arg Gln Val Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg Ile
205 210 215 220

tac cgt ctc tct ccg tcg cag agt cca atc gac cac tct ctc tcc gat 903
Tyr Arg Leu Ser Pro Ser Gln Ser Pro Ile Asp His Ser Leu Ser Asp
225 230 235

act ctt cag atg cac ttc tac gag act tgt cct tat ctc aag ttc gct 951
Thr Leu Gln Met His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe Ala
240 245 250

cac ttc acg gcg aat caa gcg att ctc gaa gct ttt caa ggg aag aaa 999
His Phe Thr Ala Asn Gln Ala Ile Leu Glu Ala Phe Gln Gly Lys Lys
255 260 265

aga gtt cat gtc att gat ttc tct atg agt caa ggt ctt caa tgg ccg 1047
Arg Val His Val Ile Asp Phe Ser Met Ser Gln Gly Leu Gln Trp Pro
270 275 280

gcg ctt atg cag gct ctt gcg ctt cga cct ggt ggt cct cct gtt ttc 1095
Ala Leu Met Gln Ala Leu Ala Leu Arg Pro Gly Gly Pro Pro Val Phe
285 290 295 300

cgg tta acc gga att ggt cca ccg gca ccg gat aat ttc gat tat ctt 1143
Arg Leu Thr Gly Ile Gly Pro Pro Ala Pro Asp Asn Phe Asp Tyr Leu
305 310 315

cat gaa gtt ggg tgt aag ctg gct cat tta gct gag gcg att cac gtt 1191
His Glu Val Gly Cys Lys Leu Ala His Leu Ala Glu Ala Ile His Val
320 325 330

gag ttt gag tac aga gga ttt gtg gct aac act tta gct gat ctt gat 1239
Glu Phe Glu Tyr Arg Gly Phe Val Ala Asn Thr Leu Ala Asp Leu Asp
335 340 345

gct tcg atg ctt gag ctt aga cca agt gag att gaa tct gtt gcg gtt 1287
Ala Ser Met Leu Glu Leu Arg Pro Ser Glu Ile Glu Ser Val Ala Val
350 355 360

aac tct gtt ttc gag ctt cac aag ctc ttg gga cga cct ggt gcg atc 1335
Asn Ser Val Phe Glu Leu His Lys Leu Leu Gly Arg Pro Gly Ala Ile
365 370 375 380

gat aag gtt ctt ggt gtg gtg aat cag att aaa ccg gag att ttc act 1383
Asp Lys Val Leu Gly Val Val Asn Gln Ile Lys Pro Glu Ile Phe Thr
385 390 395

gtg gtt gag cag gaa tcg aac cat aat agt ccg att ttc tta gat cgg 1431
Val Val Glu Gln Glu Ser Asn His Asn Ser Pro Ile Phe Leu Asp Arg
400 405 410

ttt act gag tcg ttg cat tat tac tcg acg ttg ttt gac tcg ttg gaa 1479
Phe Thr Glu Ser Leu His Tyr Tyr Ser Thr Leu Phe Asp Ser Leu Glu
415 420 425

ggt gta ccg agt ggt caa gac aag gtc atg tcg gag gtt tac ttg ggt 1527
Gly Val Pro Ser Gly Gln Asp Lys Val Met Ser Glu Val Tyr Leu Gly
430 435 440

aaa cag atc tgc aac gtt gtg gct tgt gat gga cct gac cga gtt gag 1575
Lys Gln Ile Cys Asn Val Val Ala Cys Asp Gly Pro Asp Arg Val Glu
445 450 455 460

cgt cat gaa acg ttg agt cag tgg agg aac cgg ttc ggg tct gct ggg 1623
Arg His Glu Thr Leu Ser Gln Trp Arg Asn Arg Phe Gly Ser Ala Gly
465 470 475

ttt gcg gct gca cat att ggt tcg aat gcg ttt aag caa gcg agt atg 1671
Phe Ala Ala Ala His Ile Gly Ser Asn Ala Phe Lys Gln Ala Ser Met
480 485 490

ctt ttg gct ctg ttc aac ggc ggt gag ggt tat cgg gtg gag gag agt 1719
Leu Leu Ala Leu Phe Asn Gly Gly Glu Gly Tyr Arg Val Glu Glu Ser
495 500 505

gac ggc tgt ctc atg ttg ggt tgg cac aca cga ccg ctc ata gcc acc 1767
Asp Gly Cys Leu Met Leu Gly Trp His Thr Arg Pro Leu Ile Ala Thr
510 515 520

tcg gct tgg aaa ctc tcc acc aat tag atggtggctc aatgaattga 1814
Ser Ala Trp Lys Leu Ser Thr Asn
525 530

tctgttgaac cggttatgat gatagatttc cgaccgaagc caaactaaat cctactgttt 1874

ttccctttgt cacttgttaa gatcttatct ttcattatat taggtaattg aaaaatttta 1934

atctcgccta aattact 1951 

<210> 38 
<211> 532
<212> PRT 

<213> Arabidopsis thaliana 
<400> 38 

Met Lys Arg Asp His His His His His Gln Asp Lys Lys Thr Met Met
1 5 10 15

Met Asn Glu Glu Asp Asp Gly Asn Gly Met Asp Glu Leu Leu Ala Val
20 25 30

Leu Gly Tyr Lys Val Arg Ser Ser Glu Met Ala Asp Val Ala Gln Lys
35 40 45

Leu Glu Gln Leu Glu Val Met Met Ser Asn Val Gln Glu Asp Asp Leu
50 55 60

Ser Gln Leu Ala Thr Glu Thr Val His Tyr Asn Pro Ala Glu Leu Tyr
65 70 75 80

Thr Trp Leu Asp Ser Met Leu Thr Asp Leu Asn Pro Pro Ser Ser Asn
85 90 95

Ala Glu Tyr Asp Leu Lys Ala Ile Pro Gly Asp Ala Ile Leu Asn Gln
100 105 110

Phe Ala Ile Asp Ser Ala Ser Ser Ser Asn Gln Gly Gly Gly Gly Asp
115 120 125

Thr Tyr Thr Thr Asn Lys Arg Leu Lys Cys Ser Asn Gly Val Val Glu
130 135 140

Thr Thr Thr Ala Thr Ala Glu Ser Thr Arg His Val Val Leu Val Asp
145 150 155 160

Ser Gln Glu Asn Gly Val Arg Leu Val His Ala Leu Leu Ala Cys Ala
165 170 175

Glu Ala Val Gln Lys Glu Asn Leu Thr Val Ala Glu Ala Leu Val Lys
180 185 190

Gln Ile Gly Phe Leu Ala Val Ser Gln Ile Gly Ala Met Arg Gln Val
195 200 205

Ala Thr Tyr Phe Ala Glu Ala Leu Ala Arg Arg Ile Tyr Arg Leu Ser
210 215 220

Pro Ser Gln Ser Pro Ile Asp His Ser Leu Ser Asp Thr Leu Gln Met
225 230 235 240

His Phe Tyr Glu Thr Cys Pro Tyr Leu Lys Phe Ala His Phe Thr Ala
245 250 255

Asn Gln Ala Ile Leu Glu Ala Phe Gln Gly Lys Lys Arg Val His Val
260 265 270



Ile Asp Phe Ser Met Ser Gln Gly Leu Gln Trp Pro Ala Leu Met Gln
275 280 285

Ala Leu Ala Leu Arg Pro Gly Gly Pro Pro Val Phe Arg Leu Thr Gly
290 295 300

Ile Gly Pro Pro Ala Pro Asp Asn Phe Asp Tyr Leu His Glu Val Gly
305 310 315 320

Cys Lys Leu Ala His Leu Ala Glu Ala Ile His Val Glu Phe Glu Tyr
325 330 335

Arg Gly Phe Val Ala Asn Thr Leu Ala Asp Leu Asp Ala Ser Met Leu
340 345 350

Glu Leu Arg Pro Ser Glu Ile Glu Ser Val Ala Val Asn Ser Val Phe
355 360 365

Glu Leu His Lys Leu Leu Gly Arg Pro Gly Ala Ile Asp Lys Val Leu
370 375 380

Gly Val Val Asn Gln Ile Lys Pro Glu Ile Phe Thr Val Val Glu Gln
385 390 395 400

Glu Ser Asn His Asn Ser Pro Ile Phe Leu Asp Arg Phe Thr Glu Ser
405 410 415

Leu His Tyr Tyr Ser Thr Leu Phe Asp Ser Leu Glu Gly Val Pro Ser
420 425 430

Gly Gln Asp Lys Val Met Ser Glu Val Tyr Leu Gly Lys Gln Ile Cys
435 440 445

Asn Val Val Ala Cys Asp Gly Pro Asp Arg Val Glu Arg His Glu Thr
450 455 460

Leu Ser Gln Trp Arg Asn Arg Phe Gly Ser Ala Gly Phe Ala Ala Ala
465 470 475 480

His Ile Gly Ser Asn Ala Phe Lys Gln Ala Ser Met Leu Leu Ala Leu
485 490 495

Phe Asn Gly Gly Glu Gly Tyr Arg Val Glu Glu Ser Asp Gly Cys Leu
500 505 510

Met Leu Gly Trp His Thr Arg Pro Leu Ile Ala Thr Ser Ala Trp Lys
515 520 525

Leu Ser Thr Asn
530



<210> 39 

<211> 1445 

<212> DNA 

<213> Arabidopsis thaliana 
<220> 

<221> CDS 

<222> (236)..(1306)
<223> G1944
<400> 39 



tcgaccttcc taatttccaa cctctgttct tagcaatata ttttttctcc aaaaataatt 60

ctcagtttga ttttcttctt ctagctctta agtatatttc tttgttgtta tttatctttr 120

aatcctttaa tctcatcttt gtttatcttt aatcaaaacc caaaatttac atgggttctt 180

gaaaatctag aagaaataaa ggaaacataa caaaaataga aagaaaaaga agcta atg 238
Met
1

gtc tta aat atg gag tct acc gga gaa gct gtt aga tca acc acc ggt 286
Val Leu Asn Met Glu Ser Thr Gly Glu Ala Val Arg Ser Thr Thr Gly
5 10 15

aac gac ggt ggt att acg gtg gtt aga tcc gac gcg ccg tca gat ttc 334
Asn Asp Gly Gly Ile Thr Val Val Arg Ser Asp Ala Pro Ser Asp Phe
20 25 30

cac gta gct caa aga tca gaa agc tca aac caa tct ccc acc tct gtc 382
His Val Ala Gln Arg Ser Glu Ser Ser Asn Gln Ser Pro Thr Ser Val
35 40 45

act cct cct cca cca cag cca tcg tct cat cac aca gct cct ccg ccg 430
Thr Pro Pro Pro Pro Gln Pro Ser Ser His His Thr Ala Pro Pro Pro
50 55 60 65

ctg caa att tcg acg gtg acg act acg act acg acg gcc gcg atg gaa 478
Leu Gln Ile Ser Thr Val Thr Thr Thr Thr Thr Thr Ala Ala Met Glu
70 75 80

ggt atc tcc ggt gga ctg atg aag aag aag cgt gga cgg cca agg aag 526
Gly Ile Ser Gly Gly Leu Met Lys Lys Lys Arg Gly Arg Pro Arg Lys
85 90 95

tat gga ccg gac ggg act gtt gta gcg tta tct cct aaa ccg att tca 574
Tyr Gly Pro Asp Gly Thr Val Val Ala Leu Ser Pro Lys Pro Ile Ser
100 105 110

tca gcg ccg gcg ccg tcg cat ctt ccg ccg ccg agt tca cac gtc atc 622
Ser Ala Pro Ala Pro Ser His Leu Pro Pro Pro Ser Ser His Val Ile
115 120 125

gat ttc tcc gct tct gag aaa cgt agc aaa gtg aaa cca acg aac tcg 670
Asp Phe Ser Ala Ser Glu Lys Arg Ser Lys Val Lys Pro Thr Asn Ser
130 135 140 145

ttt aac aga aca aag tat cat cac caa gtt gag aat ttg ggt gaa tgg 718
Phe Asn Arg Thr Lys Tyr His His Gln Val Glu Asn Leu Gly Glu Trp
150 155 160

gct cct tgc tcc gtc ggt ggt aat ttc aca cct cat ata atc aca gtc 766
Ala Pro Cys Ser Val Gly Gly Asn Phe Thr Pro His Ile Ile Thr Val
165 170 175

aac acc ggc gag gat gta aca atg aag ata atc tcg ttt tcg caa caa 814
Asn Thr Gly Glu Asp Val Thr Met Lys Ile Ile Ser Phe Ser Gln Gln
180 185 190

gga cct cgc tct att tgt gtt ctg tca gca aac ggt gtt att tca agc 862
Gly Pro Arg Ser Ile Cys Val Leu Ser Ala Asn Gly Val Ile Ser Ser
195 200 205

gtt aca ctt cgt cag cca gat tcc tct ggc ggc aca ttg aca tac gaa 910
Val Thr Leu Arg Gln Pro Asp Ser Ser Gly Gly Thr Leu Thr Tyr Glu
210 215 220 225

ggt cgg ttt gag ata tta tca tta tcc ggg tca ttc atg cct aat gat 958
Gly Arg Phe Glu Ile Leu Ser Leu Ser Gly Ser Phe Met Pro Asn Asp
230 235 240

tca ggc gga aca cga agt aga acg gga gga atg agt gta tcg tta gca 1006
Ser Gly Gly Thr Arg Ser Arg Thr Gly Gly Met Ser Val Ser Leu Ala
245 250 255

agt ccc gat gga cgt gta gta ggc ggt ggc ctc gcc ggt tta cta gta 1054
Ser Pro Asp Gly Arg Val Val Gly Gly Gly Leu Ala Gly Leu Leu Val
260 265 270

gcc gcg agt ccg gtt cag gtg gtt gta gga agt ttt tta gcg ggc act 1102
Ala Ala Ser Pro Val Gln Val Val Val Gly Ser Phe Leu Ala Gly Thr
275 280 285

gac cat caa gat cag aaa ccg aaa aag aac aaa cat gat ttc atg ttg 1150
Asp His Gln Asp Gln Lys Pro Lys Lys Asn Lys His Asp Phe Met Leu
290 295 300 305

tcg agt cct acc gct gca att cct atc tct agt gca gct gat cac cgg 1198
Ser Ser Pro Thr Ala Ala Ile Pro Ile Ser Ser Ala Ala Asp His Arg
310 315 320

aca atc cat tcg gtc tcg tct ctt ccg gtc aat aat aat aca tgg cag 1246
Thr Ile His Ser Val Ser Ser Leu Pro Val Asn Asn Asn Thr Trp Gln
325 330 335

act tct tta gct tcc gat cca aga aac aag cat acc gat att aat gtc 1294
Thr Ser Leu Ala Ser Asp Pro Arg Asn Lys His Thr Asp Ile Asn Val
340 345 350

aat gta act tga aatccaatct ttctctgtat tttctgttaa caagtttgat 1346 
Asn Val Thr 
355 

ttggttgttt atctacatta ggattttact aaaatggtag tattatttat agggttttag 1406 
ggtctttatt ttggttccac tgttgtcact tgtaggata 1445 

<210> 40 
<211> 356 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 40 

Met Val Leu Asn Met Glu Ser Thr Gly Glu Ala Val Arg Ser Thr Thr
1 5 10 15

Gly Asn Asp Gly Gly Ile Thr Val Val Arg Ser Asp Ala Pro Ser Asp
20 25 30

Phe His Val Ala Gln Arg Ser Glu Ser Ser Asn Gln Ser Pro Thr Ser
35 40 45

Val Thr Pro Pro Pro Pro Gln Pro Ser Ser His His Thr Ala Pro Pro
50 55 60

Pro Leu Gln Ile Ser Thr Val Thr Thr Thr Thr Thr Thr Ala Ala Met
65 70 75 80

Glu Gly Ile Ser Gly Gly Leu Met Lys Lys Lys Arg Gly Arg Pro Arg
85 90 95

Lys Tyr Gly Pro Asp Gly Thr Val Val Ala Leu Ser Pro Lys Pro Ile
100 105 110

Ser Ser Ala Pro Ala Pro Ser His Leu Pro Pro Pro Ser Ser His Val
115 120 125

Ile Asp Phe Ser Ala Ser Glu Lys Arg Ser Lys Val Lys Pro Thr Asn
130 135 140

Ser Phe Asn Arg Thr Lys Tyr His His Gln Val Glu Asn Leu Gly Glu
145 150 155 160

Trp Ala Pro Cys Ser Val Gly Gly Asn Phe Thr Pro His Ile Ile Thr
165 170 175

Val Asn Thr Gly Glu Asp Val Thr Met Lys Ile Ile Ser Phe Ser Gln
180 185 190

Gln Gly Pro Arg Ser Ile Cys Val Leu Ser Ala Asn Gly Val Ile Ser
195 200 205

Ser Val Thr Leu Arg Gln Pro Asp Ser Ser Gly Gly Thr Leu Thr Tyr
210 215 220

Glu Gly Arg Phe Glu Ile Leu Ser Leu Ser Gly Ser Phe Met Pro Asn
225 230 235 240

Asp Ser Gly Gly Thr Arg Ser Arg Thr Gly Gly Met Ser Val Ser Leu
245 250 255

Ala Ser Pro Asp Gly Arg Val Val Gly Gly Gly Leu Ala Gly Leu Leu
260 265 270

Val Ala Ala Ser Pro Val Gln Val Val Val Gly Ser Phe Leu Ala Gly
275 280 285

Thr Asp His Gln Asp Gln Lys Pro Lys Lys Asn Lys His Asp Phe Met
290 295 300

Leu Ser Ser Pro Thr Ala Ala Ile Pro Ile Ser Ser Ala Ala Asp His
305 310 315 320

Arg Thr Ile His Ser Val Ser Ser Leu Pro Val Asn Asn Asn Thr Trp
325 330 335

Gln Thr Ser Leu Ala Ser Asp Pro Arg Asn Lys His Thr Asp Ile Asn
340 345 350

Val Asn Val Thr
355

<210> 41
<211> 1558
<212> DNA
<213> Arabidopsis thaliana
<220>
<221> CDS
<222> (191)..(1396)
<223> G326
<400> 41



caattaatga catcttcttc ttctcctttc actgcaaaac cgaaagcttg agactttgag 60 

attatgtcta tgtcatcttc ttcttcttcc atcgatcact tcatcacctt tcgtcatctt 120 
gatcttattc tccactgtat aaaatcagcg agattttaag ggattgtgaa ggtaccatct 180

taaacacaaa atg ggt act tct act aca gag agt gtg gtg gcg tgt gaa 229
Met Gly Thr Ser Thr Thr Glu Ser Val Val Ala Cys Glu
1 5 10

ttt tgc ggc gag aga acg gcg gtt ctg ttt tgt aga gcc gat acg gcg 277
Phe Cys Gly Glu Arg Thr Ala Val Leu Phe Cys Arg Ala Asp Thr Ala
15 20 25

aag ctt tgt ttg cct tgt gac cag cac gtg cac tcg gcg aac ctt ctc 325
Lys Leu Cys Leu Pro Cys Asp Gln His Val His Ser Ala Asn Leu Leu
30 35 40 45

tcg 999 aag cat gtt cgt tct cag atc tgt gat aac tgt agc aaa gag 373
Ser Arg Lys His Val Arg Ser Gln Ile Cys Asp Asn Cys Ser Lys Glu
50 55 60

ccg gtg tcc gta cgt tgc ttc aca gat aat ctc gta ttg tgt cag gag 421
Pro Val Ser Val Arg Cys Phe Thr Asp Asn Leu Val Leu Cys Gln Glu
65 70 75

tgt gat tgg gat gtt cac gga agc tgt tcc tcc tcc gcg acg cat gaa 469
Cys Asp Trp Asp Val His Gly Ser Cys Ser Ser Ser Ala Thr His Glu
80 85 90

cgc tcc gcc gtg gaa ggg ttt tca ggt tgt cct tcg gtt ttg gag ctt 517
Arg Ser Ala Val Glu Gly Phe Ser Gly Cys Pro Ser Val Leu Glu Leu
95 100 105

gct gct gtg tgg gga atc gat tta aag ggt aag aag aaa gaa gat gac 565
Ala Ala Val Trp Gly Ile Asp Leu Lys Gly Lys Lys Lys Glu Asp Asp
110 115 120 125

gaa gac gaa ttg act aag aat ttt ggg atg ggg ttg gat tcg tgg ggt 613
Glu Asp Glu Leu Thr Lys Asn Phe Gly Met Gly Leu Asp Ser Trp Gly
130 135 140

tct gga tct aac atc gtt caa gaa ctg att gtt cct tat gat gtg tct 661
Ser Gly Ser Asn Ile Val Gln Glu Leu Ile Val Pro Tyr Asp Val Ser
145 150 155

tgc aaa aag caa agc ttt agc ttt ggg agg tct aag cag gta gtg ttt 709
Cys Lys Lys Gln Ser Phe Ser Phe Gly Arg Ser Lys Gln Val Val Phe
160 165 170

gaa cag ctt gag tta ctg aag aga ggc ttc gtt gaa ggc gaa gga gag 757
Glu Gln Leu Glu Leu Leu Lys Arg Gly Phe Val Glu Gly Glu Gly Glu
175 180 185

att atg gtt ccg gag gga atc aat ggc gga gga agc att tct cag cca 805
Ile Met Val Pro Glu Gly Ile Asn Gly Gly Gly Ser Ile Ser Gln Pro
190 195 200 205

tct ccg acg acg tcg ttt act tct ttg ctt atg tct caa agt ctt tgt 853
Ser Pro Thr Thr Ser Phe Thr Ser Leu Leu Met Ser Gln Ser Leu Cys
210 215 220

ggt aat ggt atg caa tgg aat gct act aat cat agc act ggc cag aac 901
Gly Asn Gly Met Gln Trp Asn Ala Thr Asn His Ser Thr Gly Gln Asn
225 230 235

act cag ata tgg gat ttt aac ttg gga cag tcg agg aac cct gat gaa 949
Thr Gln Ile Trp Asp Phe Asn Leu Gly Gln Ser Arg Asn Pro Asp Glu
240 245 250

cct agt cca gtc gaa act aaa ggc tct act ttc aca ttc aac aac gtt 997
Pro Ser Pro Val Glu Thr Lys Gly Ser Thr Phe Thr Phe Asn Asn Val
255 260 265

act cat ctc aag aac gat acc cga acc acc aat atg aat gct ttc aaa 1045
Thr His Leu Lys Asn Asp Thr Arg Thr Thr Asn Met Asn Ala Phe Lys
270 275 280 285
270 275 280 285 
gag agt tac cag gag gat tcc gtc cac tca act tct acc aag gga cag 1093
Glu Ser Tyr Gln Glu Asp Ser Val His Ser Thr Ser Thr Lys Gly Gln
290 295 300

gaa aca tct aag agc aac aat att cct gct gcc att cac tcg cat aaa 1141
Glu Thr Ser Lys Ser Asn Asn Ile Pro Ala Ala Ile His Ser His Lys
305 310 315

agt tct aac gac tcc tgt ggc ttg cat tgc acg gaa cat att gct att 1189
Ser Ser Asn Asp Ser Cys Gly Leu His Cys Thr Glu His Ile Ala Ile
320 325 330

act agt aat aga gcc aca aga ttg gtg gcg gta acg aat gct gat cta 1237
Thr Ser Asn Arg Ala Thr Arg Leu Val Ala Val Thr Asn Ala Asp Leu
335 340 345

gag cag atg gca cag aac aga gat aat gct atg cag cgg tac aag gaa 1285
Glu Gln Met Ala Gln Asn Arg Asp Asn Ala Met Gln Arg Tyr Lys Glu
350 355 360 365

aag aag aaa acg cgg aga tat gat aag acc ata aga tat gaa acg agg 1333
Lys Lys Lys Thr Arg Arg Tyr Asp Lys Thr Ile Arg Tyr Glu Thr Arg
370 375 380

aag gcg aga gcc gag acc agg ttg cgt gtt aag ggc aga ttt gtg aaa 1381
Lys Ala Arg Ala Glu Thr Arg Leu Arg Val Lys Gly Arg Phe Val Lys
385 390 395

gct aca gat cct tag atgtctctcc acgttaggtt ttacatttga gatcctaagt 1436
Ala Thr Asp Pro
400

taggaacttt ttttgttttt tctactttca actaccttgt aaatgtaaat gatcgatctt 1496

cagctgcata atgtgtggcc agatttttgt aatttttacg tttaaccttc taaaaaaaaa 1556

aa 1558 

<210> 42 
<211> 401 
<212> PRT 

<213> Arabidopsis thaliana 
<400> 42 

Met Gly Thr Ser Thr Thr Glu Ser Val Val Ala Cys Glu Phe Cys Gly
1 5 10 15

Glu Arg Thr Ala Val Leu Phe Cys Arg Ala Asp Thr Ala Lys Leu Cys
20 25 30

Leu Pro Cys Asp Gln His Val His Ser Ala Asn Leu Leu Ser Arg Lys
35 40 45

His Val Arg Ser Gln Ile Cys Asp Asn Cys Ser Lys Glu Pro Val Ser
50 55 60

Val Arg Cys Phe Thr Asp Asn Leu Val Leu Cys Gln Glu Cys Asp Trp
65 70 75 80

Asp Val His Gly Ser Cys Ser Ser Ser Ala Thr His Glu Arg Ser Ala
85 90 95

Val Glu Gly Phe Ser Gly Cys Pro Ser Val Leu Glu Leu Ala Ala Val
100 105 110

Trp Gly Ile Asp Leu Lys Gly Lys Lys Lys Glu Asp Asp Glu Asp Glu
115 120 125

Leu Thr Lys Asn Phe Gly Met Gly Leu Asp Ser Trp Gly Ser Gly Ser 
130 135 140 

Asn Ile Val Gln Glu Leu Ile Val Pro Tyr Asp Val Ser Cys Lys Lys
145 150 155 160

Gln Ser Phe Ser Phe Gly Arg Ser Lys Gln Val Val Phe Glu Gln Leu
165 170 175

Glu Leu Leu Lys Arg Gly Phe Val Glu Gly Glu Gly Glu Ile Met Val
180 185 190

Pro Glu Gly Ile Asn Gly Gly Gly Ser Ile Ser Gln Pro Ser Pro Thr
195 200 205

Thr Ser Phe Thr Ser Leu Leu Met Ser Gln Ser Leu Cys Gly Asn Gly
210 215 220

Met Gln Trp Asn Ala Thr Asn His Ser Thr Gly Gln Asn Thr Gln Ile
225 230 235 240

Trp Asp Phe Asn Leu Gly Gln Ser Arg Asn Pro Asp Glu Pro Ser Pro
245 250 255

Val Glu Thr Lys Gly Ser Thr Phe Thr Phe Asn Asn Val Thr His Leu
260 265 270

Lys Asn Asp Thr Arg Thr Thr Asn Met Asn Ala Phe Lys Glu Ser Tyr
275 280 285

Gln Glu Asp Ser Val His Ser Thr Ser Thr Lys Gly Gln Glu Thr Ser
290 295 300

Lys Ser Asn Asn Ile Pro Ala Ala Ile His Ser His Lys Ser Ser Asn
305 310 315 320

Asp Ser Cys Gly Leu His Cys Thr Glu His Ile Ala Ile Thr Ser Asn
325 330 335

Arg Ala Thr Arg Leu Val Ala Val Thr Asn Ala Asp Leu Glu Gln Met
340 345 350

Ala Gln Asn Arg Asp Asn Ala Met Gln Arg Tyr Lys Glu Lys Lys Lys
355 360 365

Thr Arg Arg Tyr Asp Lys Thr Ile Arg Tyr Glu Thr Arg Lys Ala Arg
370 375 380

Ala Glu Thr Arg Leu Arg Val Lys Gly Arg Phe Val Lys Ala Thr Asp 
385 390 395 400 

Pro 

<210> 43
<211> 844
<212> DNA
<213> Arabidopsis
<220>
<221> CDS
<222> (89)..(658)
<223> G1387
<400> 43



tctctctccc actctcactt tctctcctat tcttagttcg tgtcagaaac acacagagaa 60 

attaagaacc ctaatttaaa acagaaga atg gta cat tcg aag aag ttc cga 112
Met Val His Ser Lys Lys Phe Arg
1 5

ggt gtc cgc cag cgt cag tgg ggt tct tgg gtt tct gag att cgt cat 160
Gly Val Arg Gln Arg Gln Trp Gly Ser Trp Val Ser Glu Ile Arg His
10 15 20

cct ctc ttg aag aga aga gtg tgg cta gga aca ttc gac acg gcg gaa 208
Pro Leu Leu Lys Arg Arg Val Trp Leu Gly Thr Phe Asp Thr Ala Glu
25 30 35 40

aca gcg gct aga gcc tac gac caa gcc gcg gtt cta atg aac ggc cag 256
Thr Ala Ala Arg Ala Tyr Asp Gln Ala Ala Val Leu Met Asn Gly Gln
45 50 55

agc gcg aag act aac ttc ccc gtc atc aaa tcg aac ggt tca aat tcc 304
Ser Ala Lys Thr Asn Phe Pro Val Ile Lys Ser Asn Gly Ser Asn Ser
60 65 70

tfc g 9 a 9 att aac tct gcg tta agg tct ccc aaa tca tta tcg gaa cta 352
Leu Glu Ile Asn Ser Ala Leu Arg Ser Pro Lys Ser Leu Ser Glu Leu
75 80 85

ttg aac gct aag cta agg aag aac tgt aaa gac cag aca ccg tat ctg 400
Leu Asn Ala Lys Leu Arg Lys Asn Cys Lys Asp Gln Thr Pro Tyr Leu
90 95 100

acg tgt ctc cgc ctc gac aac gac agc tca cac atc ggc gtc tgg cag 448
Thr Cys Leu Arg Leu Asp Asn Asp Ser Ser His Ile Gly Val Trp Gln
105 110 115 120

aaa cgc gcc ggg tca aaa acg agt cca aac tgg gtc aag ctt gtt gaa 496
Lys Arg Ala Gly Ser Lys Thr Ser Pro Asn Trp Val Lys Leu Val Glu
125 130 135

cta ggt gac aaa gtt aac gca cgt ccc ggt ggt gat att gag act aat 544
Leu Gly Asp Lys Val Asn Ala Arg Pro Gly Gly Asp Ile Glu Thr Asn
140 145 150

aag atg aag gta cga aac gaa gac gtt cag gaa gat gat caa atg gcg 592
Lys Met Lys Val Arg Asn Glu Asp Val Gln Glu Asp Asp Gln Met Ala
155 160 165

atg cag atg atc gag gag ttg ctt aac tgg acc tgt cct gga tct gga 640
Met Gln Met Ile Glu Glu Leu Leu Asn Trp Thr Cys Pro Gly Ser Gly
170 175 180

tcc att gca cag gtc taa aggagaatca ttgaattata tgatcaagat 688
Ser Ile Ala Gln Val
185

aataatatag ttgagggtta ataataatcg agggtaagta atttacgtgt agctaataat 748

taatataatt ttcgaacata tatatgaata tatgatagct ctagaaatga gtacgtatat 808

atacgtaaac atttttcctc aaatatagta tatgtg 844

<210> 44 
<211> 189 
<212> PRT

<213> Arabidopsis thaliana 
<400> 44 

Met Val His Ser Lys Lys Phe Arg Gly Val Arg Gln Arg Gln Trp Gly
1 5 10 15

Ser Trp Val Ser Glu Ile Arg His Pro Leu Leu Lys Arg Arg Val Trp
20 25 30

Leu Gly Thr Phe Asp Thr Ala Glu Thr Ala Ala Arg Ala Tyr Asp Gln
35 40 45

Ala Ala Val Leu Met Asn Gly Gln Ser Ala Lys Thr Asn Phe Pro Val
50 55 60

Ile Lys Ser Asn Gly Ser Asn Ser Leu Glu Ile Asn Ser Ala Leu Arg
65 70 75 80

Ser Pro Lys Ser Leu Ser Glu Leu Leu Asn Ala Lys Leu Arg Lys Asn
85 90 95

Cys Lys Asp Gln Thr Pro Tyr Leu Thr Cys Leu Arg Leu Asp Asn Asp
100 105 110

Ser Ser His Ile Gly Val Trp Gln Lys Arg Ala Gly Ser Lys Thr Ser
115 120 125

Pro Asn Trp Val Lys Leu Val Glu Leu Gly Asp Lys Val Asn Ala Arg
130 135 140

Pro Gly Gly Asp Ile Glu Thr Asn Lys Met Lys Val Arg Asn Glu Asp
145 150 155 160

Val Gln Glu Asp Asp Gln Met Ala Met Gln Met Ile Glu Glu Leu Leu
165 170 175

Asn Trp Thr Cys Pro Gly Ser Gly Ser Ile Ala Gln Val
180 185